Abstract

Let $E_\Gamma$ be a family of hyperelliptic curves defined by $Y^2 = \bar{Q}(X,\Gamma)$,
where $\bar{Q}$ is defined over a small finite field of odd characteristic. Then
with $\bar\gamma$ in an extension degree $n$ field over this small field, we present a
deterministic algorithm for computing the zeta function of the curve $E_{\bar\gamma}$
by using Dwork deformation in rigid cohomology. The time complexity
of the algorithm is $\tilde{O}(n^{2.667})$ and it needs $\tilde{O}(n^{2.5})$ bits of memory. A
slight adaptation requires only $\tilde{O}(n^2)$ space, but costs time $\tilde{O}(n^3)$. An
implementation of this last result turns out to be quite efficient for $n$ big
enough.

AMS (MOS) Subject Classification Codes: 11G20, 11Y99, 12H25, 14F30, 
14G50, 14Q05. 



1 Introduction 

The idea that it might be interesting to compute the number of solutions to 
an algebraic equation over a finite field without actually trying to find these 
solutions is an old one, as Gauss already introduced his so called Gauss sums 
for it. In later times this topic led to wonderful theoretical results as for example 
the Weil conjectures, the pursuit of a proof of which had a very big influence 
on number theory and algebraic geometry. In those days computers did not 
yet exist as everyday objects, hence there was no real interest in computing the 
number of points on very concrete varieties, and for example the $\ell$-adic theory
that led to the final proof of the Weil conjectures seems especially unsuited for 
implementations (except in the elliptic curve case). 

In the eighties the idea of Koblitz [TH] and Miller [53] to use elliptic curves
for cryptography created a new interest in the subject, but now on this more
concrete level. Later on came also the suggestion of using Jacobians of
hyperelliptic, $C_{ab}$ and other kinds of curves. Besides these concrete reasons the
matter should of course be interesting already in itself: given a very big but
finite object with an easy defining relation and lots of structure, what is its
size? But there are even more reasons for investing in point counting
algorithms, e.g. in [T] a one-way function is constructed which uses such an
algorithm, and [3U] describes how to use Jacobians in the context of sphere
packing.

The first interesting general algorithm that saw the light was Schoof's
algorithm for calculating the number of points on an elliptic curve. This algorithm
has a time complexity polynomial in the logarithm of the field size, and was
optimized by Elkies and Atkin, thus resulting in the well known SEA-algorithm,
see [27]. It is possible to generalize this approach to higher genus curves, but
the complexity is then exponential in the genus, and this has only been done
for genus equal to 2.

As higher genus curves came into view more algorithms emerged. In practice
we can distinguish two kinds of problems: for a field size $p^n$ ($p$ prime), give an
algorithm that works polynomially in $\log(p^n)$ for small $n$, or one that works
in time $O(np)$, hence for small $p$. The $p$-adic approach started with Satoh's
canonical lift method for elliptic curves [26]. It was Kedlaya's famous paper
[15] that introduced the Monsky-Washnitzer cohomology in the computational
world by giving a general algorithm for hyperelliptic curves in odd characteristic.
These $p$-adic algorithms solve problems of the second kind, namely they are
polynomial in $np$ instead of $n \log p$. But for large fields with small characteristic
they have proven to be very efficient. Denef and Vercauteren generalized this
algorithm to even characteristic [6].

In the meantime Lauder and Wan presented an approach [21] that led to an
algorithm for very general varieties, and although not very practical, it works
polynomially in the extension degree of the field (but exponentially in the
number of variables involved). Afterwards Lauder [32] used Dwork's deformation
theory to reduce the dependency on the dimension of hypersurfaces¹. The idea
is to consider a family of hypersurfaces in one parameter $\Gamma$, such that for
example $\Gamma = 1$ gives the original problem, and $\Gamma = 0$ gives a special but easy case.
Dwork's theory then allows us to recover the necessary information in an efficient
way by skipping the most elaborate steps in Kedlaya's algorithm. We
will make this more precise further on in this paper. Indeed, we develop a
suggestion of Lauder to combine the Monsky-Washnitzer cohomology à la Kedlaya
with a deformation. Although for curves the dependency on the dimension
cannot decrease, for certain hyperelliptic curves the dependency on the extension
degree $n$ will. Namely, Kedlaya's algorithm results in a time complexity of
$\tilde{O}(n^3)$ bit operations² and $\tilde{O}(n^3)$ as bit space requirements, whereas our
algorithm requires respectively $\tilde{O}(n^{2.667})$ and $\tilde{O}(n^{2.5})$, or with a small adjustment
$\tilde{O}(n^3)$ respectively $\tilde{O}(n^2)$. It is worth noting that we can find the matrix of
the $p$th power Frobenius automorphism in essentially quadratic time, and only
the final step, taking the characteristic polynomial of the norm of this matrix,
requires the above time estimates. As we consider only odd $p$, a next project is
the even characteristic case; these results can be found in [13].

1 A similar idea is used by Tsuzuki in the computation of Kloosterman sums [31].
2 See Section 2.4 for an explanation of the notation $\tilde{O}$.

In [5] Gerkmann has followed roughly the same kind of ideas for the elliptic
curve case, including a short description for characteristic two. An
implementation for Magma of this method is available on his website³.

The paper is structured as follows. We start in Section 2 with a rough sketch
of the method, and a formulation of our results, Theorems 1 and 2. Then we
will construct the analytic setting in which the Monsky-Washnitzer cohomology
with deformation lives, along with all the theoretical proofs. This theory implies
the correctness of the algorithm, which is explained in Section 4, provided we
could work with infinite $p$-adic precision. As this is impossible, Section 5 proves that
our chosen precision suffices. Section 6 computes the complexity of the
algorithm and hence proves Theorems 1 and 2. We then give some remarks and
special interesting cases, and the final section presents results achieved with an
implementation of the algorithm.

The author wishes to thank very much Jan Denef and Alan Lauder for 
proposing the topic. He also wants to thank Lauder for providing some crucial 
suggestions, Wouter Castryck and Joost van Hamel for the helpful discussions, 
Denef for the thorough reading of the paper and the correction of many small 
mistakes, and the anonymous referees for their many valuable suggestions and 
comments. 

2 Overview of the method and results 

2.1 Introducing notation and defining the problem

Let $p$ be an odd prime, $a \ge 1$ an integer and $q := p^a$. We write $\mathbb{F}_q$ for the field
with $q$ elements. For a field $F$ its algebraic closure is denoted by $\bar{F}$. Let $\mathbb{Z}_p$
and $\mathbb{Q}_p$ be respectively the $p$-adic integers and field, and $\mathbb{Z}_q$ and $\mathbb{Q}_q$ the unique
unramified extensions of degree $a$ of $\mathbb{Z}_p$ and $\mathbb{Q}_p$. If the field $\mathbb{C}_p$ is a completion
of $\bar{\mathbb{Q}}_p$, then $\mathbb{C}_p$ is algebraically closed as well. The corresponding valuation on
these rings is written as ord, normalized to $\mathrm{ord}(p) = 1$. We will use $\sigma$ for the $p$th
power Frobenius automorphism on $\mathbb{C}_p$. The projection $\mathbb{Z}_q \to \mathbb{F}_q$ is denoted by
$x \mapsto \bar{x}$, and the Teichmüller lift of $\bar{x} \in \mathbb{F}_q$ is the unique lift $x \in \mathbb{Z}_q$ of $\bar{x}$ that
satisfies $x^q = x$. For a polynomial $\alpha$ in $X$ and $\Gamma$ we denote with $\deg_X \alpha$ the degree
with respect to $X$, and similarly we define $\deg_\Gamma \alpha$. Its derivatives are written
as $\alpha' := \frac{\partial}{\partial X}\alpha$ and $\dot\alpha := \frac{\partial}{\partial \Gamma}\alpha$.
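The Teichmüller lift can be approximated to any finite $p$-adic precision by repeated $p$th powers. The following is a minimal sketch for the prime field case $a = 1$ only; the function name `teichmuller_lift` and the precision parameter are illustrative assumptions, not notation from the paper.

```python
def teichmuller_lift(x_bar: int, p: int, precision: int) -> int:
    """Teichmueller lift of x_bar in F_p, computed modulo p**precision.

    Raising to the p-th power repeatedly converges p-adically, so
    x_bar**(p**(precision - 1)) is already correct modulo p**precision.
    """
    modulus = p ** precision
    return pow(x_bar % p, p ** (precision - 1), modulus)


p, N = 7, 5
lift = teichmuller_lift(3, p, N)
# The lift reduces to 3 modulo p and is fixed by x -> x**p modulo p**N.
print(lift % p, pow(lift, p, p ** N) == lift)
```

For $a > 1$ one would work in a representation of $\mathbb{Z}_q$ modulo $p^N$ instead of plain integers; that case is not covered by this sketch.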



3 http://wwwalt.mathematik.uni-mainz.de/~gerkmann/ellcurves.html
Suppose we are given an equation

$$Y^2 = \bar{Q}(X,\Gamma) \quad\text{with}\quad \bar{Q} \in \mathbb{F}_q[X,\Gamma],$$

where $\bar{Q}$ is a monic polynomial of odd degree $2g + 1 \ge 3$ in $X$ and degree $\kappa$ in $\Gamma$.
If we suppose moreover that $\bar{Q}(X, 0)$ has no double roots, then $Y^2 = \bar{Q}(X, 0)$
defines a hyperelliptic curve $E_0$ of genus $g$ in Weierstrass form. Let $\bar\gamma \in \bar{\mathbb{F}}_q$, let $n$
be such that $\mathbb{F}_q(\bar\gamma) = \mathbb{F}_{q^n}$ and suppose that $\bar{Q}(X,\bar\gamma)$, which defines the curve
$E_{\bar\gamma}$, has no double roots. Then our goal is to compute the zeta function and
hence the number of points of the curve $E_{\bar\gamma}$.
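For very small fields the quantity in question can be obtained by brute force, which is useful as a cross-check. The sketch below counts the points of $Y^2 = \bar{Q}(X,\bar\gamma)$ over a prime field using Euler's criterion; the family $X^3 + X + \Gamma$ and the names `count_points` and `family_member` are hypothetical choices made for illustration, and this is not the algorithm of the paper.

```python
def count_points(q_coeffs, p):
    """Count points on Y^2 = Q(X) over F_p, including the point at infinity.

    q_coeffs lists the coefficients of Q from the constant term upwards;
    Q is assumed monic of odd degree, so there is one point at infinity.
    """
    def legendre(a):
        # Euler's criterion: 1 for nonzero squares, -1 for non-squares, 0 for 0.
        s = pow(a, (p - 1) // 2, p)
        return -1 if s == p - 1 else s

    affine = 0
    for x in range(p):
        value = sum(c * pow(x, i, p) for i, c in enumerate(q_coeffs)) % p
        affine += 1 + legendre(value)  # number of y with y^2 = value
    return affine + 1


def family_member(gamma):
    # Hypothetical family Q(X, Gamma) = X^3 + X + Gamma (genus 1).
    return [gamma, 1, 0, 1]


p = 7
print(count_points(family_member(1), p))
```

In genus 1 the numerator of the zeta function over $\mathbb{F}_p$ is $1 - tu + pu^2$ in the variable $u$, with $t = p + 1 - N$ and $N$ the count returned above, so the point count and the zeta function carry the same information in that case.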



2.2 Deformation in a nutshell 

For those readers who are unfamiliar with the Monsky-Washnitzer cohomology
as used by Kedlaya in order to count points on hyperelliptic curves, we refer
to Kedlaya's paper [15] or the course of Edixhoven [8]⁴. The idea of using
deformation in the context of the Weil zeta function appears first in Dwork's
paper [7], where he derives the resulting differential equation. The cohomology
type considered in this paper is rigid cohomology; for a general definition and
properties we refer to Berthelot's paper [SJ.

As Kedlaya we start with lifting everything to characteristic zero. Define
$Q(X,\Gamma) \in \mathbb{Z}_q[X,\Gamma]$ as a degree preserving lift of $\bar{Q}(X,\Gamma)$. It is clear that for
some values of $\bar\gamma$ in $\bar{\mathbb{F}}_q$ the polynomial $\bar{Q}(X,\bar\gamma)$ is not squarefree, or equivalently
$Y^2 = \bar{Q}(X,\bar\gamma)$ has affine singularities and hence does not define a hyperelliptic
curve in Weierstrass form. With the resultant

$$r(\Gamma) := \mathrm{Res}_X(Q(X,\Gamma); Q'(X,\Gamma))$$

we have that $\bar{r}(\bar\gamma) \neq 0$ if and only if $\bar{Q}(X,\bar\gamma)$ is squarefree, hence we will require
that $\bar{r}(0) \neq 0$. We assume for a moment that the leading coefficient of $r$ is a
unit in $\mathbb{Z}_q$. Let the ring $S$ and the $S$-module $T$ be defined by

$$S := \mathbb{Q}_q\left[\Gamma, \frac{1}{r(\Gamma)}\right]^\dagger \quad\text{and}\quad T := \mathbb{Q}_q\left[\Gamma, \frac{1}{r(\Gamma)}, X, \frac{1}{\sqrt{Q}}\right]^\dagger.$$

Note that these two structures are just variants of Kedlaya's $\mathbb{Q}_q$ and $A^\dagger$ but with
the deformation parameter $\Gamma$ inserted in the right way. We have a differential
operator $d := \frac{\partial}{\partial X}dX : T \to T\,dX$, and $T\,dX/dT$ is a free $S$-module with basis

$$\left\{ \frac{X^i\,dX}{\sqrt{Q}},\ \frac{X^j\,dX}{Q} \;\middle|\; i = 0,\ldots,2g-1,\ j = 0,\ldots,2g \right\}.$$

Let $H^-_{MW}$ be the submodule generated by $\mathcal{B} := \{ X^i\,dX/\sqrt{Q} \mid i = 0,\ldots,2g-1 \}$.
There exist explicit formulae ((1) and (2) in Section 3.4) that enable us to
reduce elements in $H^-_{MW}$ to this basis.

On the module $H^-_{MW}$ we have the operator $F_p$, a lift of the characteristic $p$
Frobenius automorphism, and the connection $\nabla$ induced by $\frac{\partial}{\partial\Gamma}$.

4 Available on http://www.math.leidenuniv.nl/~edix/oww/mathofcrypt/.
With $F(\Gamma)$ a matrix for $F_p$ and $G(\Gamma)$ for $\nabla$, we will prove the differential equation

$$\dot{F}(\Gamma) + F(\Gamma) \cdot G(\Gamma) = p\Gamma^{p-1}\, G^\sigma(\Gamma^p) \cdot F(\Gamma).$$

As the zeta function of $E_{\bar\gamma}$ is completely determined by $F(\gamma)$, where $\gamma$ is the
Teichmüller lift of $\bar\gamma$, this matrix is the object that we want to compute. Indeed,
the characteristic polynomial of a matrix of $F_p^{an}$, which can be derived from
$F(\gamma)$, will be precisely the numerator of the wanted zeta function. We will see
that we only need to calculate everything up to a computable finite precision in
order to find this zeta function exactly.
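As a formal consistency check of this differential equation, one can start from the factorization $F(\Gamma) = (C^\sigma(\Gamma^p))^{-1} \cdot F(0) \cdot C(\Gamma)$ with $\dot{C} = -C \cdot G$ that is used in the overview of the algorithm. The computation below is only a sketch under that assumption and is not the proof given later in the paper; the abbreviation $A$ is introduced here for convenience.

```latex
\begin{align*}
A(\Gamma) &:= C^\sigma(\Gamma^p), \qquad
\dot{A} = p\Gamma^{p-1}\,\dot{C}^\sigma(\Gamma^p)
        = -p\Gamma^{p-1}\, A\, G^\sigma(\Gamma^p),\\
\frac{d}{d\Gamma} A^{-1} &= -A^{-1}\dot{A}A^{-1}
        = p\Gamma^{p-1}\, G^\sigma(\Gamma^p)\, A^{-1},\\
\dot{F} &= \Big(\frac{d}{d\Gamma}A^{-1}\Big) F(0)\, C + A^{-1} F(0)\, \dot{C}
        = p\Gamma^{p-1}\, G^\sigma(\Gamma^p)\cdot F - F \cdot G .
\end{align*}
```

Moving $F \cdot G$ to the left-hand side gives exactly the stated equation.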

2.3 Overview of the algorithm 

We sketch briefly how the algorithm exploits the theory above. Note that we
always work with $p$-adics modulo a certain power of $p$ and power series in $\Gamma$ up
to a certain power.

1. Lift $\bar{Q}$ to characteristic zero, and compute the resultant $r = \alpha Q + \beta Q'$.

2. Determine the matrix $G$ of the connection $\nabla$ by differentiating the basis $\mathcal{B}$
with respect to $\Gamma$ and using the reduction formulae (1) and (2) of Section 3.4.

3. Deduce from $G$ the matrix $C$ whose rows form a basis of the local solutions
of $\nabla = 0$ by solving the differential equation $\dot{C} = -C \cdot G$.

4. Compute $(C^\sigma(\Gamma^p))^{-1}$.

5. Compute $F(0)$ by Kedlaya's algorithm.

6. As $F(\Gamma) = (C^\sigma(\Gamma^p))^{-1} \cdot F(0) \cdot C(\Gamma)$, we get a series expansion for $F(\Gamma)$.

7. By representing $\mathbb{F}_{q^n}$ and $\mathbb{Q}_{q^n}$ as explained in Section 6, compute $F(\gamma)$.

8. Determine the numerator of the zeta function of $E_{\bar\gamma}$ as the characteristic
polynomial of

$$F(\gamma)^{\sigma^{an-1}} \cdot F(\gamma)^{\sigma^{an-2}} \cdots F(\gamma)^{\sigma} \cdot F(\gamma).$$
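Step 3 of the list above amounts to a coefficient-by-coefficient recursion for a matrix power series. A minimal sketch follows, assuming exact rational arithmetic as a stand-in for the $p$-adic arithmetic modulo a power of $p$ used in the paper, and the normalization $C(0) = I$; the names `mat_mul` and `solve_connection` are illustrative.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve_connection(g_coeffs, order):
    """Solve C' = -C*G with C(0) = I as a power series in Gamma.

    g_coeffs[j] is the matrix coefficient of Gamma**j in G; the result is
    the list of matrix coefficients C_0, ..., C_order.
    """
    n = len(g_coeffs[0])
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    c = [identity]
    for m in range(order):
        # Coefficient of Gamma**m in C*G is the sum of C_i * G_j over i + j = m.
        acc = [[Fraction(0)] * n for _ in range(n)]
        for i in range(m + 1):
            j = m - i
            if j < len(g_coeffs):
                prod = mat_mul(c[i], g_coeffs[j])
                acc = [[acc[r][s] + prod[r][s] for s in range(n)]
                       for r in range(n)]
        # Comparing coefficients: (m + 1) * C_{m+1} = -(C*G)_m.
        c.append([[-acc[r][s] / (m + 1) for s in range(n)] for r in range(n)])
    return c


# Scalar example G = 1, whose solution is exp(-Gamma).
series = solve_connection([[[Fraction(1)]]], 4)
print([coeff[0][0] for coeff in series])
```

The division by $m + 1$ is where $p$-adic precision is lost in practice, which is one reason the paper needs a separate precision analysis.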

2.4 Results 

The main result of this paper is a 'subcubic' time algorithm for the
computation of the zeta function of certain families of hyperelliptic curves. Here we
consider the extension degree $n$ to be the crucial parameter. In the
formulation of these results we use the $\tilde{O}$-notation as defined e.g. in [33], which means
that we ignore logarithmic factors. The relevant examples are that $O(n \log n)$
and $O(n \log n \log\log n)$ are both $\tilde{O}(n)$. Note that we ignore the dependency on
$p$, and all complexities are to be seen bitwise: for time requirements we mean
bit operations, and space is also meant as number of bits. In these results the
reader may take into account that a linear deformation will be most practical,
which has $\kappa = 1$.
Theorem 1 Let $p$ be an odd prime and let $E_{\bar\gamma}$ be a hyperelliptic curve over
$\mathbb{F}_{p^{an}}$ obtained by a deformation of the kind described above: $Y^2 = \bar{Q}(X,\bar\gamma)$
with $\bar{Q}$ defined over $\mathbb{F}_{p^a}$. There exists a deterministic algorithm that calculates
the zeta function of $E_{\bar\gamma}$ in time

$$\tilde{O}\left(n^{2.667} g^{6.376} a^3 \kappa^3\right)$$

and in memory $O\left(n^{2.5} g^5 a^3 \kappa^2 \log^3 g\right)$. We can also compute the same object
using $O\left(n^2 g^5 a^3 \kappa^2 \log^3 g\right)$ memory at the cost of a time estimate $\tilde{O}\left(n^3 g^{6.376} a^3 \kappa^3\right)$.

If we have done the calculation for one value of $\bar\gamma$, it is faster to calculate more
zeta functions from the same family.

Theorem 2 Suppose we have as precomputation computed a sufficient
approximation of the matrix of Frobenius of a family as in Theorem 1. Then we can
find the zeta function of a member of the family with parameter in $\mathbb{F}_{p^{an}}$ in
deterministic time $\tilde{O}\left(n^{2.667} g^5 a^3 \kappa\right)$ and space $O\left(n^{2.5} g^5 a^2 \kappa \log g\right)$, or $\tilde{O}\left(n^3 g^5 a^3 \kappa\right)$
respectively $O\left(n^2 g^5 a^2 \kappa \log g\right)$.

The proof of these theorems will be given in Section 6 after the complexity
analysis.

3 Analytic theory 

We have tried to make this section self-contained, to have a clear exposition of
the analytic theory behind the deformation. As a consequence we will
reintroduce some notation, but sometimes in a more general context, e.g. we will work
over an algebraic closure $\bar{\mathbb{F}}_q$ of $\mathbb{F}_q$.

In [15, 16], Kedlaya presented a concrete computable form of the
Monsky-Washnitzer cohomology. In this section we combine this with a one-dimensional
deformation as suggested by Lauder in [20]. As there are a lot of technical
convergence results, we cannot present all the details of every calculation, but
it will always be clear how to reconstruct those missing computations.

3.1 Sketch of the situation 

Let $\mathbb{F}_q$ be a finite field with $q = p^a$ elements, $p$ an odd prime, and suppose
we are given a polynomial $\bar{Q}(X,\Gamma) \in \mathbb{F}_q[X,\Gamma]$ with $\deg_X \bar{Q} = 2g + 1$ for some
integer $g \ge 1$, and monic in $X$. For every $\bar\gamma \in \bar{\mathbb{F}}_q$ we define the curve $E_{\bar\gamma}$ as
corresponding to the equation

$$E_{\bar\gamma} \longleftrightarrow Y^2 = \bar{Q}(X,\bar\gamma).$$

We need for our theory that $Y^2 = \bar{Q}(X,\bar\gamma)$ is in Weierstrass form, hence without
affine singularities. Suppose that $E_0$ satisfies this condition; the goal is then to
compute the zeta function of $E_{\bar\gamma}$ for certain 'good' $\bar\gamma \in \bar{\mathbb{F}}_q$. We remind the
reader of the rings $\mathbb{Z}_p$, $\mathbb{Q}_p$, $\mathbb{Z}_q$, $\mathbb{Q}_q$ and $\mathbb{C}_p$ defined in Section 2.1. We take
$Q(X,\Gamma) \in \mathbb{Z}_q[X,\Gamma]$ such that $Q$ projects modulo $p$ to $\bar{Q}$, $\deg_X Q = \deg_X \bar{Q}$ and
$\deg_\Gamma Q = \deg_\Gamma \bar{Q}$. The resultant $r(\Gamma) := \mathrm{Res}_X(Q(X,\Gamma); Q'(X,\Gamma))$ satisfies an
equality

$$r(\Gamma) = \alpha(X,\Gamma) \cdot Q(X,\Gamma) + \beta(X,\Gamma) \cdot \frac{\partial Q}{\partial X}(X,\Gamma), \quad\text{or}\quad r = \alpha Q + \beta Q',$$

for some $\alpha, \beta \in \mathbb{Z}_q[X,\Gamma]$. From $r(\Gamma)$ we can find the set of good parameters $\gamma$,

$$\mathbb{S} := \left\{ \gamma \in \mathbb{C}_p \;\middle|\; \gamma \text{ is a Teichmüller lift and } \bar{r}(\bar\gamma) \neq 0 \right\}.$$

The following property is obvious: for $\gamma \in \mathbb{S}$ the polynomial $\bar{Q}(X,\bar\gamma)$ is squarefree
and thus determines a hyperelliptic curve in Weierstrass form. We also have
$0 \in \mathbb{S}$ and $\mathrm{ord}(r(\gamma)) = 0$ for all $\gamma \in \mathbb{S}$.
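The role of the resultant as a detector of bad parameters can be illustrated over a small prime field. The sketch below computes the resultant as the determinant of the Sylvester matrix by Gaussian elimination modulo $p$; the family $X^3 + X + \Gamma$ is a hypothetical example and `resultant_mod_p` is an illustrative name. It only tests parameters in $\mathbb{F}_p$, whereas the paper works with the resultant as a polynomial in $\Gamma$ over $\mathbb{Z}_q$.

```python
def resultant_mod_p(f, g, p):
    """Resultant of polynomials f and g over F_p (p prime).

    Coefficients are listed from the leading term downwards; the value is
    the determinant of the Sylvester matrix, computed by elimination mod p.
    """
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + list(f) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (m - 1 - i) for i in range(m)]
    det = 1
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap flips the sign of the determinant
        det = det * rows[col][col] % p
        inv = pow(rows[col][col], -1, p)
        for r in range(col + 1, size):
            factor = rows[r][col] * inv % p
            rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[col])]
    return det % p


p = 7
# Hypothetical family Q(X, Gamma) = X^3 + X + Gamma, with Q' = 3X^2 + 1.
bad = [gamma for gamma in range(p)
       if resultant_mod_p([1, 0, 1, gamma], [3, 0, 1], p) == 0]
print(bad)
```

The parameters in `bad` are exactly those for which the fibre fails to be squarefree; all other residues correspond to Teichmüller lifts in the set of good parameters.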

Definition 3 Let $\rho$ be the degree of $r(\Gamma)$, $B := \max\{\deg_\Gamma \alpha, \deg_\Gamma \beta\}$, $D :=
\max\{\deg_X \alpha, \deg_X \beta\}$ and $\kappa := \deg_\Gamma Q$.

It is easy to see that $\rho \le 4g\kappa$ and that we can choose $\alpha$ and $\beta$ such that $D \le 2g$
and $B \le (4g - 1)\kappa$.

3.2 The base ring S 

We want as base ring an extension of $\mathbb{Q}_q$ that includes polynomials in $\Gamma$ and
ensures further on finite dimensionality of a certain quotient module. For $r(\Gamma) =
\sum_{i=0}^{\rho} r_i\Gamma^i$ let $\rho'$ be the largest index for which $\mathrm{ord}(r_{\rho'}) = 0$, and define $R(\Gamma) =
\sum_{i=0}^{\rho'} r_i\Gamma^i$. We assume for a moment that $\rho' \ge 1$, see Note 4 for the case $\rho' = 0$.
Define now the following overconvergent ring, equal to $\mathbb{Q}_q[\Gamma, 1/r(\Gamma)]^\dagger$:

$$S := \left\{ \sum_{k\in\mathbb{Z}} \frac{b_k(\Gamma)}{R(\Gamma)^k} \;\middle|\; b_k(\Gamma) \in \mathbb{Q}_q[\Gamma],\ \deg b_k(\Gamma) < \rho' \text{ and } \liminf_{k} \frac{\mathrm{ord}(b_k)}{|k|} > 0 \right\}.$$

Here the order of a polynomial is the minimum of the orders of its
coefficients. It is easy to check that $1/r(\Gamma) \in S$, and even $\sum_k b_k(\Gamma) r(\Gamma)^k \in S$ when
$\liminf \mathrm{ord}(b_k(\Gamma))/|k| > 0$. We could call the liminf in the definition the 'radius
of convergence' of the series $s(\Gamma)$. This is inspired by the fact that for power series
$\sum_i a_i\Gamma^i$ the radius of convergence is $p^m$ with $m = \liminf \mathrm{ord}(a_i)/i$,
and such a series is overconvergent if and only if $m > 0$. We define the
valuation of an element of $S$ as $\mathrm{ord}(s(\Gamma)) := \min_k \mathrm{ord}(b_k)$. We can give a more exact
interpretation of this overconvergence. An element $s$ of $S$ is convergent on some
open disk strictly bigger than the unit disk, with finitely many smaller closed
disks removed around the roots of $r(\Gamma)$. As all these roots have norm 1, the
closed disks are a subset of the unit circle.
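The claim that $1/r(\Gamma)$ lies in $S$ can be made plausible with a geometric series. The display below is a sketch, assuming that $r - R = p\,u$ for some $u \in \mathbb{Z}_q[\Gamma]$, which is what the definition of $\rho'$ suggests since all coefficients $r_i$ with $i > \rho'$ have positive order; the symbol $u$ is introduced here only for this illustration.

```latex
\[
\frac{1}{r(\Gamma)} = \frac{1}{R(\Gamma) + p\,u(\Gamma)}
  = \frac{1}{R(\Gamma)} \sum_{j \ge 0} \left( -\frac{p\,u(\Gamma)}{R(\Gamma)} \right)^{j}
  = \sum_{j \ge 0} \frac{(-1)^{j}\, p^{j}\, u(\Gamma)^{j}}{R(\Gamma)^{j+1}} .
\]
```

The $j$th term has order at least $j$, so the orders grow linearly in the exponent of $R$, in line with the liminf condition. Bringing the numerators $u^j$ to degree less than $\rho'$ still requires Euclidean division by $R$.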

We could also define $S$ as the set of all rigid analytic functions $s$ on such an
'overconvergent domain' $U$ for which $s(\mathbb{Q}_q \cap U) \subset \mathbb{Q}_q$. Indeed, the completeness
of $\mathbb{Q}_q$ implies then that $s$ is defined over $\mathbb{Q}_q$, see [H p. 196].
Note 4 The above definitions assume that $\rho' \ge 1$. If $R(\Gamma)$ is a constant, then
the theory will still work, but with a lot of changes (mostly simplifications). For
example the ring $S$ will consist of all overconvergent power series $\sum_{k \ge 0} a_k\Gamma^k$.
We will not come back to this situation, as it is always clear which results require
a reformulation. New proofs are nowhere necessary.

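Lemma 5 (Euclidean division for overconvergent power series) Let $f(\Gamma) =
\sum_{i=0}^{\infty} a_i\Gamma^i \in \mathbb{C}_p[[\Gamma]]$ and $\delta > 0$, $\varepsilon \in \mathbb{R}$ such that $\mathrm{ord}(a_i) \ge \delta i + \varepsilon$ for all $i$. Then
we can find $q(\Gamma) \in \mathbb{C}_p[[\Gamma]]$, $g(\Gamma) \in \mathbb{C}_p[\Gamma]$ such that $f(\Gamma) = q(\Gamma)R(\Gamma) + g(\Gamma)$
with $q(\Gamma) = \sum_j b_j\Gamma^j$, $\deg g < \rho'$, $\mathrm{ord}(g) \ge \varepsilon$ and for every $j$ we have $\mathrm{ord}(b_j) \ge
\delta \cdot (j + \rho') + \varepsilon$.

Proof. It is clear that we can suppose that $R$ is monic and $\varepsilon = 0$. Using
Euclidean division we find for every $i$ integral polynomials $q_i(\Gamma)$ and $g_i(\Gamma)$ such
that $a_i\Gamma^i = q_i(\Gamma)R(\Gamma) + g_i(\Gamma)$. The following properties hold: $\mathrm{ord}(q_i), \mathrm{ord}(g_i) \ge
\delta i$ and $\deg q_i = i - \rho'$. The polynomials $q$ and $g$ from the lemma are now
$q = \sum_i q_i$ and $g = \sum_i g_i$. For determining $b_j$ we only need $\sum q_i$ for those $i$
which satisfy $i - \rho' \ge j$, hence $\mathrm{ord}(b_j) \ge \delta(j + \rho')$. ■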
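The valuation bound of this lemma can be observed numerically on a truncated series. The sketch below uses integer coefficients $a_i = p^i$ (so $\delta = 1$ and $\varepsilon = 0$) and the hypothetical divisor $R = \Gamma^2 + 1$ with $\rho' = 2$; the helper names are illustrative and the truncation degree is an arbitrary choice.

```python
def valuation(x, p):
    """p-adic valuation of an integer (infinite for zero)."""
    if x == 0:
        return float("inf")
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v


def divide_by_monic(f, r):
    """Divide f by the monic polynomial r over the integers.

    Polynomials are coefficient lists from the constant term upwards;
    returns (quotient, remainder) with deg(remainder) < deg(r).
    """
    rem = list(f)
    deg_r = len(r) - 1
    quot = [0] * max(len(f) - deg_r, 0)
    for i in range(len(f) - 1, deg_r - 1, -1):
        coeff = rem[i]
        quot[i - deg_r] = coeff
        for j, rj in enumerate(r):
            rem[i - deg_r + j] -= coeff * rj
    return quot, rem[:deg_r]


p, degree = 3, 12
f = [p ** i for i in range(degree + 1)]  # ord(a_i) = i
R = [1, 0, 1]                            # hypothetical R = Gamma^2 + 1
q, g = divide_by_monic(f, R)
# Each quotient coefficient b_j should satisfy ord(b_j) >= j + rho' = j + 2.
print([valuation(b, p) for b in q])
```

The printed valuations can be compared against the bound $\delta(j + \rho')$ from the lemma.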

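Lemma 6 The ring $S$ equals the set

$$\left\{ \sum_{i=0}^{\infty} a_i\Gamma^i + \sum_{j=1}^{\infty} \frac{b_j(\Gamma)}{R(\Gamma)^j} \;\middle|\; a_i \in \mathbb{Q}_q,\ b_j(\Gamma) \in \mathbb{Q}_q[\Gamma],\ \deg b_j(\Gamma) < \rho',\ \liminf_i \frac{\mathrm{ord}(a_i)}{i} > 0 \text{ and } \liminf_j \frac{\mathrm{ord}(b_j)}{j} > 0 \right\}.$$

Proof. We only need to show that $f(\Gamma) := \sum_{i=0}^{\infty} a_i\Gamma^i$ with $\liminf \mathrm{ord}(a_i)/i > 0$
can be written as $\sum_{k=0}^{\infty} b_k(\Gamma)R(\Gamma)^k$ with $\deg b_k < \rho'$ and $\liminf \mathrm{ord}(b_k)/k > 0$,
and vice versa. The latter implication is easy, for the former we use Lemma 5.
We know that for all $i \ge 0$ we have $\mathrm{ord}(a_i) \ge \delta i + \varepsilon$ for some $\delta > 0$ and $\varepsilon \in \mathbb{R}$.
Define the sequence $f_k(\Gamma) = \sum_i a_{i,k}\Gamma^i$ inductively by $f_0(\Gamma) = f(\Gamma)$ and

$$f_k(\Gamma) = f_{k+1}(\Gamma) \cdot R(\Gamma) + b_k(\Gamma)$$

as in the previous lemma. We know then that $\mathrm{ord}(a_{i,k}) \ge \delta i + k\delta\rho' + \varepsilon$, and
hence $\mathrm{ord}(b_k) \ge \varepsilon + k\delta\rho'$. As $f(\Gamma) = \sum_k b_k(\Gamma)R(\Gamma)^k$ we find the lemma. ■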

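In order to be able to handle infinite sums of elements in $S$, we define a
certain summation condition.

Definition 7 Let $(s_k(\Gamma))_{k\in\mathbb{Z}}$ be a sequence of elements in $S$, and $s_k(\Gamma) =
\sum_{\ell\in\mathbb{Z}} b_{k\ell}(\Gamma)/R(\Gamma)^\ell$. We define a set of sequences $\mathcal{S}$ as follows: $(s_k)_k$ belongs to $\mathcal{S}$ if
and only if there exist $\delta > 0$ and $L \in \mathbb{N}$ such that

$$\inf_{k\in\mathbb{Z},\ |\ell| \ge L} \frac{\mathrm{ord}(b_{k\ell})}{|k| + |\ell|} \ge \delta.$$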
The following lemma is immediate.

Lemma 8 Suppose we have a sequence $(s_k)_k$ as above. Then $(s_k)_k \in \mathcal{S}$ and
$\liminf_k \mathrm{ord}(s_k)/|k| > 0$ is equivalent to the following: there exist a constant
$c \in \mathbb{Q}_q$ and $\delta > 0$ such that for all $k$, $\ell$

$$\mathrm{ord}(c \cdot b_{k\ell}) \ge \delta \cdot (|k| + |\ell|).$$

We now want to use the set $\mathcal{S}$ to prove convergence of an infinite sum in $S$.

Lemma 9 Let $(s_k)_{k\in\mathbb{Z}}$ be a sequence in $\mathcal{S}$ satisfying $\liminf \mathrm{ord}(s_k)/|k| > 0$,
then the sum $\sum_k s_k$ converges to an element of $S$.

Proof. The inequality in Lemma 8 is easily seen to imply the convergence
of the infinite sum; more precisely, if $s_k(\Gamma) = \sum_\ell b_{k\ell}(\Gamma)/R(\Gamma)^\ell$ for every $k$, the
sums $\sum_k b_{k\ell}$ converge. We suppose that the inequalities with $c$ and $\delta$ hold as
in Lemma 8, then we have for every $\ell$ with $|\ell| \ge 1$ that

$$\delta \le \inf_k \frac{\mathrm{ord}(b_{k\ell}) + \mathrm{ord}(c)}{|k| + |\ell|} \le \frac{\mathrm{ord}(c)}{|\ell|} + \inf_k \frac{\mathrm{ord}(b_{k\ell})}{|\ell|},$$

so that $\liminf_\ell \mathrm{ord}(\sum_k b_{k\ell})/|\ell| \ge \delta$. ■

We will apply this lemma in the following technical result, where we write $R$
and $r$ for $R(\Gamma)$ and $r(\Gamma)$.

Lemma 10 Suppose we have for all integers $t \ge 0$ a series $s_t := \sum_{k,\ell} p^{(t)}_{k\ell}/(R^\ell \cdot
r^k)$, a sum over $\ell \in \mathbb{Z}$ and $k \ge 0$, where $p^{(t)}_{k\ell} \in \mathbb{Q}_q[\Gamma]$, $\deg p^{(t)}_{k\ell} \le A(k + t)$ and
$\mathrm{ord}(p^{(t)}_{k\ell}) \ge \delta(k + t + |\ell|)$ for some constants $A, \delta > 0$. Then $(s_t)_t \in \mathcal{S}$ and
$\sum_t s_t \in S$.

Proof. Fix a positive integer $N$. It is easy to verify that from $r \equiv R \bmod p$ it
follows that $R^{k+N}/r^k \bmod p^N$ is an integral polynomial of degree at most $\rho N$.
If we consider $s_t$ modulo $p^N$, we find that $p^{(t)}_{k\ell} = 0$ as soon as $k + t + |\ell| > N/\delta$,
so if we multiply $s_t$ with $(R \cdot r)^{N/\delta}$ we end up with a polynomial. We will now
bound its degree. The worst possible degree comes from $p^{(t)}_{k\ell}$ with $k + t = N/\delta$
and $\ell = 0$, which gives a degree of no more than $AN/\delta + (\rho + \rho')N/\delta$. Combining
this with the above result for $R^{N+N/\delta}/r^{N/\delta}$ we conclude that $s_t \bmod p^N$ equals
a polynomial of degree at most $AN/\delta + (\rho + \rho')N/\delta + \rho N =: c_1N$ divided by
$R^{2N/\delta+N} =: R^{c_2N}$. If we expand the numerator as a 'polynomial in $R$' we find
that $s_t \bmod p^N$ can only have a nonzero coefficient for $R^m$ if $m \ge -c_2N$ and
$m \le c_1N/\rho' - c_2N$, or $|m| \le \max\{c_2, c_1/\rho' - c_2\}N =: c_3N$. Note that the
coefficient of $R^m$ also disappears if $\delta t > N$. If we finally write $s_t = \sum_m d^{(t)}_m/R^m$
with $\deg d^{(t)}_m < \rho'$, we find that $\mathrm{ord}(d^{(t)}_m) \ge \max\{|m|/c_3;\ \delta t\} \ge |m|/(2c_3) + (\delta/2)t$.
This implies that $(s_t)_t \in \mathcal{S}$ and using Lemma 9 we conclude that $\sum_t s_t \in S$. ■

It is clear that we can always substitute a Teichmüller lift $\gamma$ from $\mathbb{S}$ in a
series $s(\Gamma) \in S$, because $R(\gamma)$ has order 0. We conclude this section with a
lemma that allows us to derive conclusions from such a substitution.

Lemma 11 Let $s(\Gamma) = \sum_{k\in\mathbb{Z}} b_k(\Gamma)/R(\Gamma)^k \in S$ such that $\deg b_k < \rho'$ for all
$k$. Suppose we have for infinitely many $\gamma \in \mathbb{S}$ that $\mathrm{ord}(s(\gamma)) \ge \alpha$ for some real
number $\alpha$, then also for every $k \in \mathbb{Z}$ we get $\mathrm{ord}(b_k) \ge \alpha$.

Proof. After multiplication by a constant we can suppose $\alpha = 0$. Choose $K$
large enough to ensure that $\mathrm{ord}(b_k) > 0$ for $|k| > K$. Define, after replacing $k$
by $-k$ in the indexing, the truncated series

$$g(\Gamma) := \sum_{k=-K}^{K} b_k(\Gamma)R(\Gamma)^k,$$

so that for all the $\gamma \in \mathbb{S}$ considered we have $g(\gamma) \equiv s(\gamma) \bmod p$. We continue
with contraposition. Choose $\tau > 0$ such that $\mathrm{ord}(p^\tau b_k) \ge 0$ for all $k$, and for at
least one $k$ we have $\mathrm{ord}(p^\tau b_k) = 0$. Let $-N$ be the least of these $k$'s. Consider
now

$$h(\Gamma) := p^\tau R(\Gamma)^N \sum_{k=-N}^{K} b_k(\Gamma)R(\Gamma)^k.$$

Then $\bar{h}(\Gamma)$ is a nonzero polynomial over $\mathbb{F}_q$, but $\bar{h}(\bar\gamma) = 0$ for every considered
$\gamma \in \mathbb{S}$. This gives the required contradiction. ■

We have two important consequences of this lemma. First, it will allow us
to reduce to the situation described by Kedlaya if we substitute $\gamma \in \mathbb{S}$. And
second, it implies that for nonzero $s \in S$ it is impossible to become zero in
infinitely many $\gamma \in \mathbb{S}$.



3.3 The "dagger algebra" T 

The $\mathbb{Q}_q$-algebra $A^\dagger \otimes \mathbb{Q}_q$ used in Kedlaya's algorithm [15] will be replaced by
an algebra $T$ over the ring $S$. The following definition of $T$ is motivated by
two requirements. First, substituting some $\gamma \in \mathbb{S}$ should reduce $T(\gamma)$ to the
structure $A^\dagger \otimes \mathbb{Q}_q$ considered by Kedlaya. And second, the cohomology module
$T\,dX/dT$ should be finite dimensional. Therefore we take for $T$ the
overconvergent completion of $\mathbb{Q}_q[\Gamma, 1/R(\Gamma), X, 1/\sqrt{Q}]$ as follows:

Definition 12

$$T := \left\{ \sum_{k\in\mathbb{Z}} \sum_{i=0}^{2g} \frac{s_{ik}(\Gamma)\,X^i}{\sqrt{Q}^{\,k}} \;\middle|\; s_{ik} \in S,\ \liminf_k \frac{\mathrm{ord}(s_{ik})}{|k|} > 0 \text{ and } (s_{ik})_k \in \mathcal{S} \right\}.$$

With this definition $T$ is an $S$-algebra. An element $t(X)$ of $T$ will also be
denoted by $\sum_k t_k(X,\Gamma)/\sqrt{Q}^k$. A more exact formulation of the first aim for
$T$ is as follows. Let $\gamma \in \mathbb{S}$ and suppose $\bar\gamma \in \mathbb{F}_{q^n}$ for $n$ minimal. Then $T$ with $\gamma$
substituted for $\Gamma$ is precisely $A^\dagger \otimes \mathbb{Q}_{q^n}$ for the curve $Y^2 = \bar{Q}(X,\bar\gamma)$ over $\mathbb{F}_{q^n}$ as
it appears in [15]. We denote $T(\gamma)$ for the structure we obtain for $T$ with the
substitution $\Gamma \leftarrow \gamma$, hence in fact $T(\gamma) = T \otimes \mathbb{Q}_{q^n}/(\Gamma - \gamma)$.

We first prove that $A^\dagger \otimes \mathbb{Q}_{q^n} \subset T(\gamma)$. Choose $\sum_{i,k} a_{ik}X^i/\sqrt{Q}^k \in A^\dagger \otimes \mathbb{Q}_{q^n}$,
with $a_{ik} \in \mathbb{Q}_{q^n}$ and $\liminf_k \mathrm{ord}(a_{ik})/|k| > 0$. Then $\mathbb{Q}_q(\gamma) = \mathbb{Q}_{q^n}$ implies that we
can find for every $a_{ik}$ a polynomial $s_{ik}(\Gamma) \in \mathbb{Q}_q[\Gamma]$ with the same order, degree
smaller than $n$ and $s_{ik}(\gamma) = a_{ik}$. This gives an element $\sum_{i,k} s_{ik}(\Gamma)X^i/\sqrt{Q}^k$ in
$T$ which reduces to the element we started with.

The other inclusion is easier. If $s(\Gamma) \in S$, then $s(\gamma)$ converges to an element
of $\mathbb{Q}_{q^n}$ due to the completeness of $\mathbb{Q}_{q^n}$.



3.4 A basis for the quotient module TdX/dT 

A crucial role in the theory of Monsky and Washnitzer is played by the
cohomology $T\,dX/dT$, where $d$ is the differential operator $\frac{\partial}{\partial X}dX$ on $T$. In the
classical case the whole construction of the dagger space $A^\dagger$ is needed to ensure
this quotient has the right properties, in particular that it is finite dimensional.
Define

$$\mathcal{D} := \left\{ \frac{X^i\,dX}{\sqrt{Q}},\ \frac{X^j\,dX}{Q} \;\middle|\; i = 0,\ldots,2g-1,\ j = 0,\ldots,2g \right\}.$$

If we substitute some $\gamma \in \mathbb{S}$ in $T$, this set $\mathcal{D}$ is a basis of $A^\dagger dX/dA^\dagger$ as in
Kedlaya's paper [15]. Here we have a similar result.

Theorem 13 The $S$-module $T\,dX/dT$ is free with basis $\mathcal{D}$.

The proof falls into two parts. That $\mathcal{D}$ is a generating set will be
proven in the following lemma, and to see that $\mathcal{D}$ is free we can work as follows.
Write $\mathcal{D} = \{b_i\}$ and suppose there exists a nontrivial relation $\sum_i s_ib_i = 0$ for
some $s_i \in S$, and $s_j \neq 0$. Now Lemma 11 gives some $\gamma \in \mathbb{S}$ for which $s_j(\gamma) \neq 0$,
and this implies in turn the nonzero relation $\sum_i s_i(\gamma)b_i(\gamma) = 0$ in the classical
case for $A^\dagger \otimes \mathbb{Q}_{q^n}$ for some $n$, contradicting the result of Kedlaya that the set
$\{b_i(\gamma)\}_i$ is a basis for $A^\dagger \otimes \mathbb{Q}_{q^n}$. ■

Lemma 14 $\mathcal{D}$ is a generating set.

Proof. In order to express $\sum_{k\in\mathbb{Z}} B_k(X,\Gamma)\,dX/\sqrt{Q}^k \in T\,dX$ as an $S$-linear
combination of the elements of the basis $\mathcal{D}$, we use reduction formulae similar
to those of Kedlaya in [15]. We will first show how to reduce each $B_k\,dX/\sqrt{Q}^k$
to an expression in the basis $\mathcal{D}$:

$$\frac{B_k\,dX}{\sqrt{Q}^k} \equiv \sum_{b\in\mathcal{D}} s_b^{(k)}\, b,$$

and next we will prove that $\sum_k s_b^{(k)}$ converges to an element of $S$.

In Section 3.1 we have seen the relation $r = \alpha Q + \beta Q'$. For some numerator
$B_k(X,\Gamma)$ this becomes

$$B_k = \frac{1}{r}(\alpha B_k)Q + \frac{1}{r}(\beta B_k)Q' =: \frac{1}{r}(P_kQ + R_kQ').$$

We start by reducing any expression $\sum_k B_k\,dX/\sqrt{Q}^k$ to a fraction with $\sqrt{Q}$ or
$Q$ in the denominator. For $k \le 0$ and $k$ even such a form is simply exact, as
integrating does not change the 'overconvergence' property. If $k \le -1$ and odd
we can write $\sqrt{Q}^{2\ell+1}$ as $Q^{\ell+1}/\sqrt{Q}$, and for $k > 2$ we calculate $d(R_k/\sqrt{Q}^{k-2})$,
which gives

$$\frac{B_k\,dX}{\sqrt{Q}^{k}} = \frac{1}{r}\left(P_k + \frac{2}{k-2}R_k'\right)\frac{dX}{\sqrt{Q}^{k-2}} - 2\,d\left(\frac{1}{k-2} \cdot \frac{R_k}{r\sqrt{Q}^{k-2}}\right). \qquad (1)$$

In both cases, where $k \le -1$ and odd, or $k > 2$, repeated application of (1) gives
that we end for every $B_k\,dX/\sqrt{Q}^k$ with a polynomial (of degree in $X$ probably
very high) divided by $\sqrt{Q}$ or $Q$. The latter case gives an exact form plus a
polynomial of degree at most $2g$ divided by $Q$, and for the former case we can
use the following formula for $m \ge 2g$:

$$\frac{X^m\,dX}{\sqrt{Q}} = \frac{m-2g}{m-g+1/2} \cdot \frac{(X^m - X^{m-2g-1}Q)\,dX}{\sqrt{Q}} + \frac{1}{2m-2g+1} \cdot \frac{((2g+1)X^m - X^{m-2g}Q')\,dX}{\sqrt{Q}} + d\left(\frac{X^{m-2g}\sqrt{Q}}{m-g+1/2}\right). \qquad (2)$$
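The reduction of a power $X^m\,dX/\sqrt{Q}$ with $m \ge 2g$ rests on the identity $d(X^{m-2g}\sqrt{Q}) = \big((m-2g)X^{m-2g-1}Q + \tfrac12 X^{m-2g}Q'\big)\,dX/\sqrt{Q}$, whose leading term is $(m - g + \tfrac12)X^m$ for monic $Q$. The sketch below applies this repeatedly over the rationals for a fixed polynomial $Q$ in $X$ alone; it ignores the parameter $\Gamma$ and all $p$-adic precision questions, and the name `reduce_power` is an illustrative assumption.

```python
from fractions import Fraction


def reduce_power(m, q_coeffs):
    """Reduce X^m dX/sqrt(Q) modulo exact forms to degree < 2g in X.

    q_coeffs lists the coefficients of the monic polynomial Q of degree
    2g + 1 from the constant term upwards. The result lists the
    coefficients of h with X^m dX/sqrt(Q) = h dX/sqrt(Q) + (exact form).
    """
    two_g = len(q_coeffs) - 2                      # Q has degree 2g + 1
    g = two_g // 2
    dq = [i * c for i, c in enumerate(q_coeffs)][1:]   # coefficients of Q'
    poly = [Fraction(0)] * max(m + 1, two_g)
    poly[m] = Fraction(1)
    for deg in range(m, two_g - 1, -1):
        lead = poly[deg]
        if lead == 0:
            continue
        # Subtract scale * d(X^(deg-2g) sqrt(Q)), which kills the X^deg term.
        scale = lead / (Fraction(deg - g) + Fraction(1, 2))
        shift = deg - two_g
        if shift > 0:
            for j, c in enumerate(q_coeffs):
                poly[shift - 1 + j] -= scale * shift * c
        for j, c in enumerate(dq):
            poly[shift + j] -= scale * Fraction(c) / 2
    return poly[:two_g]


# Q = X^3 + 1: X^2 dX/sqrt(Q) is exact, and X^3 dX/sqrt(Q) reduces to a
# multiple of dX/sqrt(Q).
print(reduce_power(2, [1, 0, 0, 1]), reduce_power(3, [1, 0, 0, 1]))
```

The denominators $m - g + \tfrac12$ that appear here are the source of the $p$-adic precision loss that Kedlaya's lemma controls.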

At this point we have shown that we can always express $B_k\,dX/\sqrt{Q}^k$ in the basis
$\mathcal{D}$, but the question remains whether the sum of these reductions converges in
the right way. We will go into details only in the most difficult case where
$k$ is odd and positive. We will often use the following lemma of Kedlaya as
formulated in [5, Lemma 17.79]. The reduction in this lemma takes place in $A^\dagger$
as defined in [15].

Lemma 15 (Kedlaya) Let $h \in \mathbb{Z}_q[X]$ be a polynomial of degree $\le 2g$, then
for $m \in \mathbb{N}$ the reduction of $h(X)Y^{2m+1}dX$ (resp. $h(X)/Y^{2m+1}dX$) becomes
integral upon multiplication by $p^\nu$ with $\nu \ge \lfloor \log_p((2g+1)(m+1) - 2) \rfloor$ (resp.
$\nu \ge \lfloor \log_p(2m+1) \rfloor$).
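The exponents in this lemma grow only logarithmically in $m$, which is what keeps the precision loss manageable. A minimal sketch that evaluates both bounds with exact integer arithmetic follows; the function names are illustrative assumptions, and the formulas are simply those stated in the lemma.

```python
def floor_log(p, x):
    """Largest integer v with p**v <= x, for integers p >= 2 and x >= 1."""
    v = 0
    while p ** (v + 1) <= x:
        v += 1
    return v


def denominator_bounds(p, g, m):
    """Exponents of p from the lemma.

    Returns the bound for h(X) Y^(2m+1) dX followed by the bound for
    h(X) / Y^(2m+1) dX.
    """
    return (floor_log(p, (2 * g + 1) * (m + 1) - 2),
            floor_log(p, 2 * m + 1))


print(denominator_bounds(3, 1, 4))
```

Using integer comparisons instead of floating-point logarithms avoids rounding errors when the argument is an exact power of $p$.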

Note 16 By "the reduction" in this lemma, the following is meant. If

$$h(X)Y^{2m+1}dX = \tilde{h}(X)\,dX/Y + d(f(X,Y))$$

with $\deg h(X) \le 2g$ and $\deg \tilde{h}(X) < 2g$, then there exists a constant $c \in \mathbb{Q}_q$
such that both $p^\nu \tilde{h}(X)$ and $p^\nu(f(X,Y) - c)$ are integral. We will always make
the (allowed) assumption that $p^\nu f(X,Y)$ is integral.

Moreover, the proof of Kedlaya implies also the following: let $h(X) = X^k$ for
some $k \le 2g$, then with $f(X,Y) = \sum_j f_j(X)/Y^{2j+1}$ and $\deg f_j(X) \le 2g$, we
have almost always that $f_j = 0$ for $j < 0$ and $j \ge m$. The only exception is
$h(X) = X^{2g}$ and $m = 0$, where also $1/Y^{-1}$ can have a nonzero coefficient.

Because of Lemma 11 we can use Kedlaya's lemma in our theory: if it holds for
every $\gamma \in \mathbb{S}$, it will also hold for $T$. We continue with the proof of Lemma 14.
Clearly it suffices to consider expressions of the form $\sum_{k=1}^{\infty} s_kX^u\,dX/\sqrt{Q}^{2k+1} \in
T\,dX$, with $0 \le u \le 2g$, $s_k = \sum_\ell s_{k\ell}/R^\ell$ and all $s_{k\ell} \in \mathbb{Q}_q$. From Lemma 8 it
follows that we may suppose that $\mathrm{ord}(s_{k\ell}) \ge \delta(k + |\ell|)$ for some $\delta > 0$. It is easy
to see that we can multiply with an even bigger constant in order to ensure that
$\mathrm{ord}(s_{k\ell}) \ge \delta(k + |\ell|) + \lfloor \log_p(2k+1) \rfloor$ for a possibly smaller $\delta$. Writing $s^{(k)}_{k\ell}$ for
$s_{k\ell}X^u$ we use formula (1) for $i = k, \ldots, 1$ as

$$\frac{s^{(i)}_{k\ell}\,dX}{\sqrt{Q}^{2i+1}} = \frac{s^{(i-1)}_{k\ell}\,dX}{\sqrt{Q}^{2i-1}} + d\left(\frac{f^{(i-1)}_{k\ell}}{\sqrt{Q}^{2i-1}}\right),$$

where $s^{(i)}_{k\ell}, f^{(i)}_{k\ell} \in S[X]$. This gives as result of the reduction process

$$\frac{s^{(k)}_{k\ell}\,dX}{\sqrt{Q}^{2k+1}} = \frac{s^{(0)}_{k\ell}\,dX}{\sqrt{Q}} + d\left(\sum_{i=0}^{k-1} \frac{f^{(i)}_{k\ell}}{\sqrt{Q}^{2i+1}}\right) =: \frac{s^{(0)}_{k\ell}\,dX}{\sqrt{Q}} + d\varphi_{k\ell}.$$

Remember that $B := \max\{\deg_\Gamma \alpha, \deg_\Gamma \beta\}$ and $D := \max\{\deg_X \alpha, \deg_X \beta\}$. We
will write $\deg_\Gamma$ for the degree in $\Gamma$ of the numerator in the following paragraphs.

Let us write $r^{k-i}f^{(i)}_{k\ell} = f_0 + f_1Q + \cdots + f_AQ^A$ where $\deg_X f_j \le 2g$ and the
$f_j$ and $A$ depend on $k$, $\ell$ and $i$. As $A \le ((k-i)D + u)/(2g+1) \le k - i + 1$ we
may take $A = k - i + 1$. We also see that

$$\deg_\Gamma f_j \le (j+1)\kappa + \deg_\Gamma f^{(i)}_{k\ell} \le (j+1)\kappa + (k-i)B.$$

Write $\varphi_{k\ell} = \sum_t \varphi_{k\ell t}/\sqrt{Q}^{2t+1}$, then the notes after Lemma 15 imply that we
can limit ourselves to $0 \le t \le k$. Indeed, the $\varphi_{k\ell t}$ with $t < 0$ will disappear
when the reduction is finished, i.e. after applying formula (2). The coefficient
of $1/\sqrt{Q}^{2t+1}$ in $\varphi_{k\ell}$ is

$$\varphi_{k\ell t} = \sum_j \frac{p_{k\ell tj}}{r^{k-t-j}}, \qquad p_{k\ell tj} := \left(f_j \text{ from the expansion of } r^{k-t-j}f^{(t+j)}_{k\ell}\right),$$

where $j$ ranges from 0 to $\lfloor (k-t+1)/2 \rfloor$. Indeed, the largest $i$ for which
$f^{(i)}_{k\ell}$ can appear in this sum satisfies $Q^{A=k-i+1}/\sqrt{Q}^{2i+1} = 1/\sqrt{Q}^{2t+1}$, hence
$i = (k+t+1)/2$. We see that $\deg_\Gamma p_{k\ell tj} \le (j+1)\kappa + (k-i)B \le (2\kappa + B)k$, as
$i \le k$ and $0 \le t \le k$. Lemma 15 gives that

$$\mathrm{ord}\Big(\sum_j p_{k\ell tj}\Big) \ge \mathrm{ord}(s_{k\ell}) - \lfloor \log_p(2k+1) \rfloor \ge \delta(k + |\ell|) \ge \delta t. \qquad (3)$$

The coefficient of $1/(R^\ell\sqrt{Q}^{2t+1})$ in $\varphi_{k\ell}$ is then $\sum_{k,j} p_{k\ell tj}/r^{k-t-j}$ where $k \ge t$
and $0 \le j \le (k-t+1)/2$. In this sum we look at the terms for which $1/r^{k-t-j} =
1/r^K$ for $K \ge 0$. This means that we limit ourselves to $k = K+t, \ldots, 2K+t+1$,
and at the same time $j$ increases from 0 to $K + 1$. We conclude that
we have a sum $s_t = \sum_{\ell,K}$ of polynomials of degree at most $(2\kappa + B)(2K + t + 1)$
with valuation at least $\delta(K + t + |\ell|)$ divided by $R^\ell r^K$. Lemma 10 now implies
that $(s_t)_t \in \mathcal{S}$ and hence $\sum_{k,\ell} \varphi_{k\ell} \in T$.
Now we consider expressing $s_{k\ell}$ itself in the basis $\mathcal{D}$. For this we must reduce
the $s^{(0)}_{k\ell}$ further with formula (2) in order to obtain an expression of $\deg_X < 2g$.
As explained above, the appearing differentials are not relevant anymore. We
have that $\deg_\Gamma(r^k \cdot s^{(0)}_{k\ell}) \le kB$. Every time when we reduce a power of $X$ in
$s^{(0)}_{k\ell}$, the degree in $\Gamma$ increases with at most $\kappa$. If we write the reductions as $s'_{k\ell}$
this implies

$$\deg_\Gamma(r^k s'_{k\ell}) \le k(B + D\kappa) + \kappa u.$$

Lemma 15 implies again that $\mathrm{ord}(s'_{k\ell}) \ge \delta(k + |\ell|)$, and Lemma 10 gives the
required result $s := \sum_{k,\ell} s'_{k\ell}/R^\ell \in S[X]_{<2g}$.

With a similar but easier argument we can find the same result for the
reduction of $\sum_{k=1}^{\infty} B_k\,dX/Q^k$ and $\sum_{k=1}^{\infty} B_k\sqrt{Q}^{2k-1}dX$. We refer to [12] for a
complete proof. ■



3.5 The construction of a Frobenius lift

The purpose of this section is to construct a $p$th power Frobenius $F_p$ on $S$ and $T$.
We remind the reader of the map $\sigma : \mathbb{Q}_q \to \mathbb{Q}_q$, the Frobenius automorphism.
The definition of $F_p$ on $S$ and the module of differentials $S\,d\Gamma$ is obvious with
the following relations (see footnote 5) and $\sigma$-linearity:

$$\Gamma \mapsto \Gamma^p, \qquad \frac{1}{R(\Gamma)} \mapsto \frac{1}{R^\sigma(\Gamma^p)}, \qquad d\Gamma \mapsto p\,\Gamma^{p-1}\,d\Gamma.$$

The fact that $R^\sigma(\Gamma^p) - R(\Gamma)^p \equiv 0 \bmod p$ gives that $1/R^\sigma(\Gamma^p) \in S$. The definition
on $T$ and $T\,dX$ is more complicated, namely

$$X \mapsto X^p, \qquad dX \mapsto p\,X^{p-1}\,dX, \qquad \sqrt{Q(X,\Gamma)} \mapsto Q(X,\Gamma)^{p/2}\cdot\left(1 - \frac{Q(X,\Gamma)^p - Q^\sigma(X^p,\Gamma^p)}{Q(X,\Gamma)^p}\right)^{1/2}.$$

It is easy to check that this definition implies $\bigl(F_p(\sqrt{Q(X,\Gamma)})\bigr)^2 = Q^\sigma(X^p,\Gamma^p)$.

A tedious but not so hard computation shows that $F_p(\sqrt{Q(X,\Gamma)})$ is in $T$; we
omit it here. A detailed proof can be found in the author's forthcoming Ph.D.
thesis [12].

It is clear that for a concrete $\gamma \in \mathbb{S}$ these definitions boil down to those on
$\mathbb{Q}_{q^n}$ and $A^\dagger \otimes \mathbb{Q}_{q^n}$.

3.6 The differential equation 

The introduction of the parameter $\Gamma$ has as its most important consequence the
existence of the connection $\nabla := \frac{\partial}{\partial\Gamma}$ on $T$. Consider the following cube of


5 It is the definition $\Gamma \mapsto \Gamma^p$ that forces us to take Teichmüller lifts for the substitution $\Gamma \leftarrow \gamma$.
Indeed, denote by 'subs' this substitution, then we require subs(Frob($\Gamma$)) = Frob(subs($\Gamma$)).
modules and morphisms:

[Diagram: cube with the vertices $T$, $T\,dX$, $T\,d\Gamma$ and $T\,dX\,d\Gamma$ on the front face and the same four modules on the back face.]


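It is easy to see that all faces of this cube commute. Define $H_{MW} := T\,dX/d(T)$,
the first cohomology of Monsky and Washnitzer, then the above diagram
implies the following commutative diagram:

$$\begin{array}{ccc} H_{MW} & \xrightarrow{\ \nabla\ } & H_{MW}\,d\Gamma \\ {\scriptstyle F_p}\downarrow & & \downarrow{\scriptstyle F_p} \\ H_{MW} & \xrightarrow{\ \nabla\ } & H_{MW}\,d\Gamma \end{array} \qquad (4)$$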



The hyperelliptic involution $\iota : T \to T : X \mapsto X, \sqrt{Q} \mapsto -\sqrt{Q}$ splits the module
$T$ in eigenspaces $T_+$ with eigenvalue 1 and $T_-$ with eigenvalue $-1$. It is clear
that $F_p \circ \iota = \iota \circ F_p$ and hence $F_p(T_-) \subset T_-$, and as Kedlaya points out in [15],
we may restrict ourselves to $T_-$. The differential operators $\nabla$ and $d$ commute
with $\iota$ as well, so we can restrict (4) to $H^-_{MW} := T_-\,dX/d(T_-)$. From the proof
of Lemma [H it follows that $\mathcal{B} = \{X^i\,dX/\sqrt{Q} \mid i = 0, \ldots, 2g-1\} = \{b_i\}$, where
$b_i := X^i\,dX/\sqrt{Q}$, is a basis for this free $S$-module. Write $F(\Gamma) = (F_{i\ell})$ and
$G(\Gamma) = (G_{i\ell})$ for the matrices of $F_p$ respectively $\nabla$ (with entries in $S$), more
precisely

$$F_p(b_i) = \sum_{\ell=0}^{2g-1} F_{i\ell}\,b_\ell \quad\text{and}\quad \nabla(b_i) = \sum_{\ell=0}^{2g-1} G_{i\ell}\,b_\ell\,d\Gamma.$$

Note that $F_p$ and $\nabla$ are not $S$-linear: $F_p(\sum_i s_i b_i) = \sum_i F_p(s_i)F_p(b_i)$ and

$$\nabla\Bigl(\sum_i s_i b_i\Bigr) = \sum_i \frac{\partial s_i}{\partial\Gamma}\,b_i\,d\Gamma + \sum_i s_i\,\nabla(b_i).$$



Using the relation $\nabla \circ F_p = F_p \circ \nabla$ on basis elements, we easily deduce the
differential equation

$$\frac{\partial}{\partial\Gamma}F(\Gamma) + F(\Gamma)\cdot G(\Gamma) = p\,\Gamma^{p-1}\cdot G^\sigma(\Gamma^p)\cdot F(\Gamma).$$
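
The deduction can be written out in two lines. What follows is only a sketch, using the conventions for the matrices of $F_p$ and $\nabla$ on the basis $b_i$ fixed above and writing $\Gamma$ for the deformation parameter; the summation index $m$ is introduced here purely for the expansion.

```latex
\begin{align*}
\nabla\bigl(F_p(b_i)\bigr)
  &= \sum_{\ell} \frac{\partial F_{i\ell}}{\partial\Gamma}\, b_\ell\, d\Gamma
     + \sum_{\ell} F_{i\ell} \sum_{m} G_{\ell m}\, b_m\, d\Gamma
   = \sum_{m} \Bigl(\tfrac{\partial F}{\partial\Gamma} + F\,G\Bigr)_{im} b_m\, d\Gamma, \\
F_p\bigl(\nabla(b_i)\bigr)
  &= \sum_{\ell} G^{\sigma}_{i\ell}(\Gamma^p)\, F_p(b_\ell)\; p\,\Gamma^{p-1}\, d\Gamma
   = \sum_{m} \bigl(p\,\Gamma^{p-1}\, G^{\sigma}(\Gamma^p)\, F\bigr)_{im} b_m\, d\Gamma .
\end{align*}
```

Since the $b_m$ form a basis, comparing the coefficients of $b_m\,d\Gamma$ in both lines gives the matrix equation.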






In order to solve this equation we can work as follows. Remember that we
assumed that $\Gamma = 0$ gives a situation that can be handled by Kedlaya's algorithm,
or more precisely we can compute $F(0)$ very fast as the curve $E_0$ is defined over
the small field $\mathbb{F}_q$. A first step is to solve the equation $\nabla = 0$ locally at the
origin. This means that we compute a matrix $C(\Gamma)$ over $\mathbb{Q}_q[[\Gamma]]$ which satisfies
$\frac{\partial}{\partial\Gamma}C(\Gamma) + C(\Gamma)G(\Gamma) = 0$ and $C(0) = I$. Indeed, with $C(\Gamma) = (c_{i\ell})$ the vectors
$v_i := \sum_\ell c_{i\ell}b_\ell$ form then a basis of the solutions of $\nabla = 0$. Next we apply
$\nabla \circ F_p = F_p \circ \nabla$ to these local solutions $v_i$ in order to find that the $F_p(v_i)$ are also
solutions around zero of $\nabla = 0$. Hence we know that their matrix $C^\sigma(\Gamma^p)\cdot F(\Gamma)$
equals $A\cdot C(\Gamma)$ for some constant matrix $A$. Comparing these matrices in $\Gamma = 0$
yields $A = F(0)$ and hence the equality

$$F(\Gamma) = (C^\sigma(\Gamma^p))^{-1}\cdot F(0)\cdot C(\Gamma).$$

At this point we have computed $F(\Gamma)$ as matrix over $\mathbb{Q}_q[[\Gamma]]$, with entries which
do not necessarily converge in Teichmüller lifts $\gamma$. However, we know that $F(\Gamma)$
is defined over $S$, hence in a last step we have to recover this representation as
a matrix with entries in $S$.

4 The algorithm 

We now give a detailed exposition of the algorithm. Everything needed for
a concrete implementation will be explained, either explicitly or by reference,
except the handling of the $p$-adic fields and the action of Frobenius on them,
which we postpone to the complexity analysis, Section 6.1.

Input. A prime number $p \ge 3$, the field $\mathbb{F}_q$ with $q = p^a$ and represented as
$\mathbb{F}_p[x]/\bar\chi(x)$, $\bar\gamma$ and the field $\mathbb{F}_q(\bar\gamma)$ represented as $\mathbb{F}_{q^n} = \mathbb{F}_q[y]/\bar\varphi(y)$ of extension
degree $n$ over $\mathbb{F}_q$. A polynomial $\bar Q(X,\Gamma)$ over $\mathbb{F}_q$, monic in $X$ of degree $2g+1$
where $g \ge 1$, such that $\bar Q(X,0)$ and $\bar Q(X,\bar\gamma)$ are both squarefree.
Output. The zeta function of the smooth completion of the curve $Y^2 = \bar Q(X,\bar\gamma)$.

Step 1 Some preliminary computations. 

Let $\chi(x)$ be a naive lift to characteristic zero of $\bar\chi(x)$, such that the coefficients of
$\chi(x)$ are small integers, e.g. from the set $\{-\frac{p-1}{2}, \ldots, \frac{p-1}{2}\}$, and so that $\deg\chi = \deg\bar\chi$.
We represent $\mathbb{Q}_q$ as $\mathbb{Q}_p[x]/\chi(x)$. Lift $\bar Q$ to $Q$ by lifting its coefficients
in the same way, and set $\kappa := \deg_\Gamma Q$. Define now the following constants:

$\eta := \lceil 2g\log_p g + g \rceil$,
$N := \lceil nga/2 + (2g+1)\log_p 2 \rceil$,
$N_8 := an\lceil \log_p(g) + 2 \rceil + \lfloor 2gan(\log_p g + 3) \rfloor$,
$N_b := N + N_8$,
$N_\Gamma := (2N_b+5)(8g+2)\kappa p + 1$,
$M := p(2N_b+4) + (p-1)/2$,
$N_3 := (2\eta+1)\lceil \log_p N_\Gamma \rceil$,
$N_4 := \lceil \log_2(N_\Gamma)\,\eta\,(2\lceil \log_p(N_\Gamma/p) \rceil + \lceil \log_p N_\Gamma \rceil) \rceil$,

and finally we have

$N_6 := 3\eta\lceil \log_p N_\Gamma \rceil$, $N_a := N_b + N_3 + N_4 + N_6$.

We work always modulo $\Gamma^{N_\Gamma}$. Moreover, in steps 2 to 6 we work modulo
$p^{N_a}$ and in the last two steps modulo $p^{N_b}$. $N$ is the precision needed in order
to recover the zeta function correctly, and $N_i$ for $i \in \{3, 4, 6, 8\}$ is the precision
lost in step $i$. Compute $\alpha(X,\Gamma)$, $\beta(X,\Gamma)$ and the resultant $r(\Gamma)$ such that
$r(\Gamma) = \alpha(X,\Gamma)\cdot Q(X,\Gamma) + \beta(X,\Gamma)\cdot Q'(X,\Gamma)$. This is a matter of simple linear
algebra, see e.g. Section 6.3 in [35]. We note that we need the representation of
$\alpha$ and $\beta$ in their classical form in order to obtain the lowest degree in $X$. Define
finally $B := \max\{\deg_\Gamma \alpha, \deg_\Gamma \beta\}$ and $D := \max\{\deg_X \alpha, \deg_X \beta\}$.

Step 2 The matrix $G(\Gamma) = H(\Gamma)/r(\Gamma)$ of the connection.

We compute the matrix $H(\Gamma)$ as explained in the proof of Proposition 18.

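Step 3 The local solutions $C$.

The matrix $C(\Gamma)$ satisfies the condition $C(0) = I$ and $\dot C + C\cdot G = 0$ (the dot denoting $\partial/\partial\Gamma$), which
becomes $r\cdot\dot C = -C\cdot H$. Write $H = \sum_{i=0}^{h} H_i\Gamma^i$ and $C = \sum_{i=0}^{N_\Gamma-1} C_i\Gamma^i$ where the
$H_i$ and $C_i$ are matrices over $\mathbb{Q}_q$ and $C_0 = I$. With $r(\Gamma) = r_\rho\Gamma^\rho + \cdots + r_1\Gamma + r_0$
we obtain

$$\sum_{k=1}^{N_\Gamma-1}\left(\sum_{i=0}^{\min\{\rho,k\}} (k-i)\,r_i\,C_{k-i}\right)\Gamma^{k-1} = -\sum_{k=0}^{N_\Gamma-2}\left(\sum_{i=0}^{\min\{h,k\}} C_{k-i}\cdot H_i\right)\Gamma^{k}.$$

Proceeding recursively, the resulting formulae are (this proves at once the
uniqueness of $C$)

$C_1 = -\frac{1}{r_0}C_0H_0$, similar $C_2, \ldots$; for $k \ge \rho$ and $k > h$ we have

$$C_k = \frac{1}{k\,r_0}\bigl(-C_{k-1}H_0 - \cdots - C_{k-h-1}H_h - r_1(k-1)C_{k-1} - \cdots - r_\rho(k-\rho)C_{k-\rho}\bigr).$$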
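
The recursion of step 3 is easy to prototype. The sketch below is a toy model only: it works over the rationals with exact `Fraction` arithmetic instead of over the $p$-adic field modulo a fixed precision, and the names `mat_mul` and `local_solution` are illustrative choices, not part of the paper.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def local_solution(H, r, N):
    """Return C_0, ..., C_{N-1} with r * C' = -C * H and C(0) = I.

    H[i] is the matrix coefficient of the i-th power of the parameter in H,
    r[i] the scalar coefficient of the i-th power in r, and r[0] != 0.
    """
    d = len(H[0])
    C = [[[Fraction(int(i == j)) for j in range(d)] for i in range(d)]]
    for k in range(1, N):
        acc = [[Fraction(0)] * d for _ in range(d)]
        # contribution  - C_{k-1-i} * H_i
        for i, Hi in enumerate(H):
            if k - 1 - i >= 0:
                P = mat_mul(C[k - 1 - i], Hi)
                acc = [[acc[a][b] - P[a][b] for b in range(d)] for a in range(d)]
        # contribution  - r_i * (k - i) * C_{k-i}  for i >= 1
        for i in range(1, min(len(r) - 1, k) + 1):
            acc = [[acc[a][b] - r[i] * (k - i) * C[k - i][a][b]
                    for b in range(d)] for a in range(d)]
        C.append([[x / (k * r[0]) for x in row] for row in acc])
    return C
```

For instance, with $H = 1$ and $r = 1 - \Gamma$ the equation reads $(1-\Gamma)\dot C = -C$, whose solution with $C(0) = 1$ is $C = 1 - \Gamma$; `local_solution([[[1]]], [1, -1], 4)` reproduces these coefficients.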

Step 4 $(C^\sigma(\Gamma^p))^{-1}$.

First compute $C^\sigma(\Gamma^p)$, then invert the resulting matrix using quadratically
convergent Newton approximation as in [22, Section 5.2.2].
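
A minimal sketch of such a Newton inversion for a matrix of power series is given below. It is not the routine of [22]: it assumes exact rational coefficients, a constant term equal to the identity (which holds here because $C(0) = I$), and schoolbook series multiplication; the names `series_mul` and `newton_inverse` are hypothetical.

```python
from fractions import Fraction


def mat_mul(A, B):
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def series_mul(S, T, prec):
    """Product of two matrix power series (lists of d x d matrices), truncated at prec terms."""
    d = len(S[0])
    out = [[[Fraction(0)] * d for _ in range(d)] for _ in range(prec)]
    for i, Si in enumerate(S[:prec]):
        for j, Tj in enumerate(T[:prec - i]):
            P = mat_mul(Si, Tj)
            out[i + j] = [[out[i + j][a][b] + P[a][b] for b in range(d)]
                          for a in range(d)]
    return out


def newton_inverse(A, prec):
    """Inverse of the series A (with A[0] the identity) up to prec terms.

    Each pass X <- X * (2 - A * X) doubles the number of correct terms.
    """
    d = len(A[0])
    identity = [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]
    X = [identity]
    cur = 1
    while cur < prec:
        cur = min(2 * cur, prec)
        E = [[[-x for x in row] for row in M] for M in series_mul(A, X, cur)]
        E[0] = [[E[0][a][b] + 2 * identity[a][b] for b in range(d)]
                for a in range(d)]
        X = series_mul(X, E, cur)
    return X
```

Only about $\log_2$ of the target precision many passes are needed, which is the count of iterations that enters the constant $N_4$.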

Step 5 Frobenius for the $\Gamma = 0$ case.

Compute $F(0)$ with Kedlaya's algorithm, but with the (higher) $p$-adic precision
$N_a$.

Step 6 Frobenius for the general case.

Calculate first $F(\Gamma) = (C^\sigma(\Gamma^p))^{-1}F(0)C(\Gamma)$ modulo $\Gamma^{N_\Gamma}$, and then $r(\Gamma)^M F(\Gamma)$
modulo $\Gamma^{N_\Gamma}$.

We now switch to the precision $p^{N_b}$.
Step 7 Frobenius for the concrete $\gamma$.

Compute (e.g. as in [28]) the minimal polynomial $\bar\varphi(y)$ of $\bar\gamma$ over $\mathbb{F}_q$, and write
$\mathbb{F}_{q^n}$ as $\mathbb{F}_q[y]/\bar\varphi(y)$. Lift $\bar\varphi(y)$ to $\varphi(y)$ over $\mathbb{Q}_q$ such that $\mathbb{Q}_{q^n} = \mathbb{Q}_q[y]/\varphi(y)$ and
$\gamma = y$ is a Teichmüller lift as explained in Section 6.1. This trick (see footnote 6) allows us to
evaluate an entry $g(\Gamma)/r(\Gamma)^M$ of $F(\Gamma)$ by simply reducing $g(\gamma) = g(y)$ modulo
$\varphi(y)$, and then multiplying with $(1/r(y))^M$ modulo $\varphi(y)$. All these calculations
can be done very quickly as explained in [2].
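
To make the role of the Teichmüller lift concrete, here is a toy version of step 7 in the simplest situation $a = n = 1$, where all $p$-adic numbers are plain integers modulo $p^N$. This is a simplification chosen for illustration (the paper works in an extension field), and the function names are not from the paper.

```python
def teichmuller(a, p, N):
    """Teichmuller lift of the residue a (with p not dividing a) modulo p**N."""
    return pow(a, p ** (N - 1), p ** N)


def horner(coeffs, x, mod):
    """Evaluate the polynomial with coefficients coeffs (low degree first) at x."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % mod
    return acc


def evaluate_entry(g, r, M, gamma, mod):
    """Value of g / r**M at gamma, assuming r(gamma) is a unit modulo mod.

    pow with a negative exponent and a modulus needs Python 3.8 or newer.
    """
    return horner(g, gamma, mod) * pow(horner(r, gamma, mod), -M, mod) % mod
```

The lift $\omega$ of a residue satisfies $\omega^p = \omega$, which is exactly the compatibility between substitution and Frobenius required in footnote 5 when $\sigma$ is trivial.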

Step 8 The zeta function.
The matrix $\mathcal{F}$ of $F_p^{an}$ equals

$$\mathcal{F} := F(\gamma)^{\sigma^{an-1}}\cdot F(\gamma)^{\sigma^{an-2}} \cdots F(\gamma)^{\sigma}\cdot F(\gamma).$$

This product can be calculated as in [15]: $M_1 := F(\gamma)^\sigma\cdot F(\gamma)$, $M_2 := M_1^{\sigma^2}\cdot M_1$,
$M_3 := M_2^{\sigma^4}\cdot M_2$ etc, and taking the product of those $M_i$ implied by the binary
expansion of $an$. As shown by Kedlaya in [15], the zeta function $Z(t)$ of (the
smooth completion of) the hyperelliptic curve $E_{\bar\gamma} : Y^2 = \bar Q(X,\bar\gamma)$ is given by
$Z(t) = \det(I - \mathcal{F}t)\,(1-t)^{-1}(1 - p^{an}t)^{-1}$, and using Newton's formula

$$\det(I - \mathcal{F}t) = \exp\left(-\sum_{k=1}^{\infty} \operatorname{Tr}(\mathcal{F}^k)\,\frac{t^k}{k}\right)$$

we compute $\det(I - \mathcal{F}t)$ modulo $p^N$. As Kedlaya showed in [15], each coefficient
$a_i$ of this polynomial satisfies $|a_i| \le 2^{2g}p^{ang/2}$, which allows us finally to
recover the zeta function.
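
Both ingredients of step 8 can be sketched generically. The code below is an illustration under stated assumptions, not the paper's implementation: `twisted_product` takes the ring operations as parameters (a function `sigma_pow(x, k)` applying $\sigma$ exactly $k$ times, and a product `mul`), and `det_from_traces` works with exact rationals. As a stand-in for the Frobenius automorphism, the toy helpers use complex conjugation on $2 \times 2$ matrices of Gaussian integers.

```python
from fractions import Fraction


def twisted_product(a, m, sigma_pow, mul):
    """Return sigma^(m-1)(a) * ... * sigma(a) * a using O(log m) products (m >= 1)."""
    result, length = None, 0       # result = product of the first `length` factors
    block, block_len = a, 1        # block = sigma^(block_len-1)(a) * ... * a
    while m:
        if m & 1:
            if result is None:
                result, length = block, block_len
            else:
                result = mul(sigma_pow(block, length), result)
                length += block_len
        m >>= 1
        if m:
            block = mul(sigma_pow(block, block_len), block)
            block_len *= 2
    return result


def det_from_traces(traces):
    """Coefficients c_0, ..., c_d of det(I - F t), given traces[k-1] = Tr(F^k).

    Uses k * c_k = -(c_{k-1} p_1 + ... + c_0 p_k), i.e. Newton's identities.
    """
    c = [Fraction(1)]
    for k in range(1, len(traces) + 1):
        s = sum(c[k - i] * traces[i - 1] for i in range(1, k + 1))
        c.append(-s / k)
    return c


# Toy ring for experiments: 2 x 2 matrices over the Gaussian integers,
# with complex conjugation playing the role of sigma.
def mat_mul2(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def conj_pow(A, k):
    return tuple(tuple(x.conjugate() for x in row) for row in A) if k % 2 else A
```

The division by $k$ in `det_from_traces` is where the denominators of Newton's formula enter, which is why the precision analysis has to account for them.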



5 Proof of correctness 

5.1 The behavior of the matrix of Frobenius 

In order to determine the required accuracy in our algorithm, we need precise
estimates on the convergence rate of the entries of the matrix $F(\Gamma)$. This will
be investigated in the following proposition. Define $\mu := (pg-4)/2$.

Proposition 17 Let $N \in \mathbb{N}$ and $f(\Gamma)$ an entry of $F(\Gamma)$, reduced modulo $p^N$. If
$N \ge \mu$, then with $\chi_1 = p(2N+4) + (p-1)/2$ and $\chi_2 = (2N+5)(8g+2)\kappa p + 1$
we have that $r(\Gamma)^{\chi_1}f(\Gamma)$ is a polynomial of degree at most $\chi_2$. Moreover,
$\operatorname{ord}(f(\Gamma)) \ge -(\log_p g + 2)$.

Proof. The matrix $F(\Gamma)$ is obtained by computing Frobenius on a basis element
$X^k/\sqrt{Q}$ and then reducing this result using formulae (1) and (2).

6 This trick was suggested to us by Alan G.B. Lauder, personal communication. 
At this point we know that $\operatorname{ord}(B_i) \ge i$, $\deg_X B_i \le kp + i((2g+1)p - 1)$ and
$\deg_\Gamma B_i \le \kappa p i$. Let $\alpha_i := pi + (p-1)/2$. In a first stage we use formula (1) as often
as needed in order to find $B_i'/\sqrt{Q}$ as reduction of $B_i/Q^{\alpha_i+1/2}$.

Formula (1) has to be used $\alpha_i$ times for $B_i$, hence we find

$$\deg_X B_i' \le \deg_X B_i + \alpha_i D; \qquad \deg_\Gamma(r^{\alpha_i}B_i') \le \deg_\Gamma B_i + \alpha_i B.$$

The next step consists of applying formula (2) in order to decrease the degree
in $X$. If we write the result as $B_i'/\sqrt{Q}\,dX = B_i''/\sqrt{Q}\,dX$ with $\deg_X B_i'' < 2g$,
then

$$\deg_\Gamma(r^{\alpha_i}B_i'') \le \deg_\Gamma(r^{\alpha_i}B_i') + (\deg_X(r^{\alpha_i}B_i') - (2g-1))\kappa.$$

At this point we only need to know which $B_i''$ will not be zero modulo $p^N$. Using
Euclidean division we can write $B_i(X,\Gamma) = t_i(X,\Gamma)Q(X,\Gamma)^{\alpha_i+1} + s_i(X,\Gamma)$, with
$\deg_X s_i(X,\Gamma) < (\alpha_i+1)(2g+1)$. This implies that

$$\frac{B_i}{Q^{\alpha_i+1/2}} = t_i\sqrt{Q} + \frac{s_i}{Q^{\alpha_i+1/2}}, \qquad (5)$$

where writing the second term as $\sum_j s_{ij}/\sqrt{Q}^{\,2j+1}$ would give $s_{ij} = 0$ for $j < 0$.
It is easily seen that for such an expression Lemma 15 remains true, regardless
of the condition $\deg h(X) < 2g$, and hence for the valuation of the reduction of
$s_i/Q^{\alpha_i+1/2}$ we find at least

$$i - \lfloor \log_p(2\alpha_i+1) \rfloor \ge i - (1 + \log_p(2i+1)) \ge i/2 - 2. \qquad (6)$$

This means we can confine ourselves to those $i$ for which $i/2 - 2 < N$, or
$i < 2N+4$. The resulting denominator $r^{\alpha_i}$ gives then $\chi_1 = \alpha_{2N+4}$, and the
degree in $\Gamma$ of the numerator satisfies

$$\deg_\Gamma r^{\alpha_{2N+4}}B_{2N+4}'' \le (2N+4)(2\kappa p + 8g\kappa p) + 5g\kappa p \le \chi_2 - 1,$$

by a tedious but trivial calculation. We have however not yet taken the
reduction of $t_i\sqrt{Q}$ in (5) into account. We will show that the condition $N \ge \mu$
ensures that this contribution can be ignored. It is immediately clear that as
$k \le 2g-1$,

$$\deg_X t_i \le \deg_X B_i - (2g+1)(\alpha_i+1) < pg - i. \qquad (7)$$

If $t_i \ne 0$ then $i < pg$ by (7), and as we required that $2N+4 \ge pg$ in the theorem,
these $t_i$ will not be responsible for a bigger degree in $\Gamma$ than $\chi_2 - 1$.

In order to show that $\operatorname{ord}(f(\Gamma)) \ge -(\log_p(g) + 2)$ we have to do a little more
work. The inequality certainly follows from (6) for the part $s_i/Q^{\alpha_i+1/2}$ in (5)
above, but for $t_i\sqrt{Q}$ this is not so clear. Writing $t_i = t_{i0} + t_{i1}Q + \cdots + t_{iA_i}Q^{A_i}$
we find that $A_i < \frac{pg-i}{2g+1}$. Lemma 15 implies then that the valuation of the
reduction of $t_i\sqrt{Q}$ is dominated by the possible value for $t_{iA_i}Q^{A_i}$, being

$$i - \lfloor \log_p((2g+1)(A_i/2+1)) - 2 \rfloor \ge -(\log_p g + 2),$$

as can be checked directly. ■



5.2 Estimates on H and C 

Proposition 18 The matrix $H = rG$ becomes integral after multiplying by
$p^{\lfloor 10g/(p-1) \rfloor}$ and consists of polynomials in $\Gamma$ of degree at most $8g\kappa$.
Proof. The entries $(rG_{i\ell})$ of $H$ are obtained by computing

$$r\cdot\nabla\left(\frac{X^i\,dX}{\sqrt{Q}}\right) = -\frac{r\,X^i\,\dot Q}{2\sqrt{Q}^{\,3}}\,dX\,d\Gamma \equiv -\frac{1}{2\sqrt{Q}}\,(P + 2R')\,dX\,d\Gamma$$

where $P := X^i\alpha\dot Q$ and $R := X^i\beta\dot Q$ (with $\dot Q = \partial Q/\partial\Gamma$) and we have used formula (1). This has to
be reduced further with formula (2). We have that

$\deg_X(P + 2R') \le 2g - 1 + D + 2g \le 6g - 1$, and
$\deg_\Gamma(P + 2R') \le B + \kappa - 1 \le 4g\kappa$.

Each reduction of $X^m\,dX/\sqrt{Q}$ using (2) increases $\deg_\Gamma$ by at most $\kappa$, decreases
$\deg_X$ by at least 1, and introduces a denominator $2m - 2g + 1$. Together this
gives a denominator $\prod_{m \le 6g-1}(2m - 2g + 1)$, easily seen to be a divisor of $(10g)!$,
with order at most $\lfloor 10g/(p-1) \rfloor$. The degree in $\Gamma$ is obtained by adding at most $4g - 1$
times $\kappa$ to $4g\kappa$. ■

We note that if we would use Lemma 15 the above estimate could be improved,
but this would not have any influence on the complexity estimates in the end.

In order to control the valuation of the entries of $C$, we need bounds for $F(\Gamma)$
and $F(0)^{-1}$. The former was obtained in Proposition 17 and for the latter we
have the following lemma.

Lemma 19 The matrix $F(\Gamma)^{-1}$ is also defined over $S$, and if $\operatorname{ord}(F(\Gamma)) \ge e$
for some real number $e$, then $\operatorname{ord}(F(\Gamma)^{-1}) \ge (2g-1)e - g$.

Proof. Define $d(\Gamma) := \det(F(\Gamma))$, then clearly $d(\Gamma) \in S$. Let $\gamma \in \mathbb{S}$ and choose
$m$ such that $\bar\gamma \in \mathbb{F}_{p^m}$, then the Weil conjectures imply that



i=0 



which gives immediately that $\operatorname{ord}(d(\gamma)) = g$. Lemma 11 gives then the same
valuation for $d(\Gamma)$. From the following lemma we can see that $d(\Gamma)^{-1} \in S$, and
it is clear that $\operatorname{ord}(d(\Gamma)^{-1}) = -g$. From linear algebra we know that an entry
of $F(\Gamma)^{-1}$ is a minor of $F(\Gamma)$ multiplied by $d(\Gamma)^{-1}$, which concludes the proof.
Lemma 20 Let $s \in S$ and $e \in \mathbb{R}$ such that for all $\gamma \in \mathbb{S}$ we have $\operatorname{ord}(s(\gamma)) = e$.
Then $1/s \in S$.

Proof. Multiplying $s$ with a constant if necessary we may suppose that $e = 0$,
and hence $\operatorname{ord}(s(\Gamma)) = 0$ as well. Let $K$ be a finite extension field of $\mathbb{Q}_q$ such
that $R(\Gamma)$ splits completely in $K$. If we can prove that $1/s \in S \otimes K$, then
trivially also $1/s \in S$.

We write s(r) = £ fceZ M r )-R( r ) fc and choose (5 > such that ord(6 fe (r)) > 
S ■ \k\ for k > 0. This implies that & fc (T) = mod p SN for |fe| > N and TV > 0. 
From now on we only consider TV big enough for these inequalities to hold. 
Let s N (T) := (s(T) modp 5N ), then it is clear that s^ } (T) := R(T) N s N (T) is a 
polynomial of degree less than 2p'N. Let 1Z be the set of roots of R(T) in K. 
As ord(s^-'(7)) = for all 7 G S, there exists a factorization (with ceZ g x ) 

a$ (r) = c JJ (T - a;) 4 - mod p, £ 4 < 2p'JV. 

If we define the integral polynomial s^(r) := R(T) 2p ' N / Yl(T — x) lal , then we 
can also find a rational number t > and an integral polynomial s$ (T) such 
that 

4\r)=c^^-p^(T). 
s N ( r ) 

Inverting sn(T) and substituting the above expressions yields 

Mr)-4>(r)- c fl(r) Sw(r) ^ l P J 

Now working as at the end of the proof of Lemma ITD1 we see that (1/sn)n 
converges to an element of S. I 

In step 3 of the algorithm we compute $C(\Gamma) = \sum_k C_k\Gamma^k$ using an expression
of the form $C_k = \frac{1}{k}(\cdots)$. With a trivial argument this yields an $O(k)$-bound
for $-\operatorname{ord}(C_k)$. It is however possible to do much better if we use Dwork's trick.
This works as follows: suppose $C(\Gamma)$ converges on a certain disk with radius
$\epsilon < 1$. Then as $C(\Gamma) = F(0)^{-1}C^\sigma(\Gamma^p)F(\Gamma)$, $C$ will also converge in the disk
with radius $\epsilon^{1/p}$. Repeating this process leads to convergence on the open unit
disk, and if we keep track of the orders we find even an explicit bound, as proven
in the following proposition.

Proposition 21 Write $C(\Gamma) = \sum_{k=0}^{\infty} C_k\Gamma^k$, then we have

ord(C k ) > -riog p (fc + l)l(2 ff log p ( ff )+ 5 ). 

Proof. As just mentioned we use the deformation relation for $F(\Gamma)$. Write $\alpha$,
$\beta$ for lower bounds for the order of $F(\Gamma)$, $F(0)^{-1}$. If we look at the coefficients
of $\Gamma^0$ up to $\Gamma^{p-1}$ (as matrices), we see that

$$\operatorname{ord}(C_k) \ge \operatorname{ord}(C_0^\sigma) + \alpha + \beta$$

for $k = 0, \ldots, p-1$, where $\operatorname{ord}(C_0^\sigma) = \operatorname{ord}(C_0) = 0$. Repeating this gives

$$\operatorname{ord}(C_k) \ge \min_{0 \le j \le p-1}\operatorname{ord}(C_j^\sigma) + \alpha + \beta$$

for $k = p, \ldots, p^2-1$, so e.g. $\operatorname{ord}(C_{p^2-1}) \ge 2(\alpha+\beta)$. Pushing this further gives
in general

$$\operatorname{ord}(C_k) \ge \lceil \log_p(k+1) \rceil(\alpha+\beta).$$

We have seen that we can take $\alpha = -(\log_p(g) + 2)$ and $\beta = -(2g-1)(\log_p(g) + 2) - g$,
which proves the proposition. ■

By a similar calculation we find with $(C^\sigma(\Gamma^p))^{-1} = \sum_{k \ge 0} D_k\Gamma^k$ that

$$\operatorname{ord}(D_k) \ge -\lceil \log_p(k/p+1) \rceil(2g\log_p g + g).$$

5.3 Proof of correctness 

In this section we will prove that the chosen values for the $N_i$ and $M$ in the
algorithm are sufficient in order to find the correct result. First of all, just as
in Kedlaya's paper [15] it suffices to compute the zeta function modulo $p^N$ in
order to determine it exactly. During the algorithm every multiplication of
non-integral $p$-adic numbers can generate a decrease in the accuracy. We will handle
this problem by using Lemma 22 from [22], which says that the introduced error in
the multiplication of two matrices with negative $p$-orders $-x$ and $-y$ is at most
$x + y$. We will show that the loss in accuracy in step $i$ is $N_i$ for $i \in \{3, 4, 6, 8\}$.

The fact that we have to compute $F(\gamma)$ with precision $N_b = N + N_8$ can
be seen as follows. The error that could be introduced while computing the
product of the matrices $F(\gamma)^{\sigma^i}$ has valuation at most $an(\log_p g + 2)$, and
computing $\det(I - \mathcal{F}t)$ cannot remove more than $\operatorname{ord}((2g)!) + \log_p(2g) + 2gan(\log_p g + 2) \le 2gan(\log_p g + 3)$
of the resulting accuracy. Indeed, the worst appearing
denominators in Newton's formula are $(2g)!$ and $2g$, and $\operatorname{ord}(\operatorname{Tr}(\mathcal{F}^{2g})) \ge -2gan(\log_p g + 2)$.
Proposition 17 shows that the chosen $N_\Gamma$ and $M$ suffice for
step 7, and Lemma 22 will give $N_3$. In the formula from [22] for $(C^\sigma(\Gamma^p))^{-1}$
we have to calculate products of the form $D_kCD_k$ where $D_k$ is the $k$th
approximation of $(C^\sigma(\Gamma^p))^{-1}$, obtained during the $k$th iteration of the Newton
approximation. This has to be done $\log_2 N_\Gamma$ times, hence we get $N_4$. We can
use the estimates for $C$, $F(0)$ and $(C^\sigma(\Gamma^p))^{-1}$ for step 6, which is dominated
by $N_6$.

A naive estimate using the formulae in step 3 would give a huge bound on the
loss in precision, but the following lemma shows that we can do better.

Lemma 22 Denoting the exact solution of $\nabla = 0$ by $C$ and our computed
solution modulo $p^{N_a}$ by $\tilde C$ we find for $E = \sum_{k \ge 0} E_k\Gamma^k := p^{-N_a}(\tilde C - C)$ that

$$\operatorname{ord}(E_k) \ge -(4g\log_p(g) + 2g + 1)\lceil \log_p(k+1) \rceil.$$
Proof. This lemma is from [15], but we give a slightly different proof.
Truncating the right hand side in the computation of C modulo p Na in step 3 

implies that C is a solution of rC + CH = p Na £ with £ an integral matrix. We 
choose a matrix K such that E + KC = C. Then as VC = and H = rG = 
-rC~ l C we see that -VE = -(VC - V{KC)) = KC + KC + KCG = KC 
on the one hand, and on the other hand — VE = —p Na (WC — VC) = -£, As a 
consequence we have KC = -£, or K = J -£C~ 1 dT + constant matrix. If we 
substitute T — 0, we see that this constant matrix will be the identity matrix. 
Hence we can conclude with 

E = -(l\ eC ~ XdS ) C ' 

Recall that $\eta = 2g\log_p(g) + g$, then the remark after Proposition 21 implies for
$C^{-1}$ the bound

$$\operatorname{ord}((C^{-1})_k) \ge -\eta\lceil \log_p(k+1) \rceil.$$

The power series in the matrix $\frac{1}{r}\mathcal{E}$ are integral, and integrating the coefficient
of $\Gamma^k$ adds at most an order of $\log_p k$. Finally we multiply with $C$, so that
$\operatorname{ord}(E_k) \ge -(2\eta+1)\lceil \log_p(k+1) \rceil$. ■

Remark that in fact we can prove more: let $D$ be a matrix such that $D \bmod \Gamma = 0$
and $\nabla D$ is integral. Then $D$ will satisfy the bounds for $E$ in the lemma.



6 Complexity analysis 

All the time estimates in this section are measured in bit operations, and mem- 
ory requirements are expressed in bit space. 



6.1 p-Adic computations 

In this section we work modulo $p^\nu$ for some positive integer $\nu$ and we assume
that every $p$-adic number has valuation at least $-O(\nu)$. Let $\mathbb{Q}_q = \mathbb{Q}_p[x]/\chi(x)$
as in step 1, then the memory requirements for an element of $\mathbb{Q}_q$ are $a\nu\log_2 p$,
or $O(a\nu)$ if we ignore the dependency on $p$. As pointed out in [2], computing in
such a quotient ring is essentially linear in the element size, hence $\tilde{O}(a\nu)$.

Let us first consider calculating a power of Frobenius $\sigma^k$ with $k = O(a)$ of
an element $\alpha(x)$ of $\mathbb{Q}_q$; we can suppose $\alpha(x) \in \mathbb{Z}_q$ as $\sigma$ acts trivially on $\mathbb{Q}_p$.
We start with computing $\sigma^k(x)$, and in the next step $\sigma^k(\alpha(x)) = \alpha(\sigma^k(x))$.
We have that $\sigma^k(x) \equiv x^{p^k} \bmod p$ and $\chi(\sigma^k(x)) = 0$. First compute $x^{p^k} \bmod p$
by repeated squaring (this requires $O(k)$ arithmetic operations in $\mathbb{F}_q$). It
follows now from the irreducibility of $\bar\chi$ that $\bar\chi'(x^{p^k}) \not\equiv 0 \bmod p$, so we can start
quadratic Newton approximation. The total cost for $\sigma^k(\alpha(x))$ is $\tilde{O}(a^2\nu)$, if we
evaluate $\alpha$ at $\sigma^k(x)$ in a (naive) efficient way as with Horner's method, using
$O(a)$ operations in $\mathbb{Z}_q \bmod p^\nu$. The memory cost is just $O(a\nu)$.
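
The procedure just described (start from the $p^k$-th power of $x$ modulo $p$, then lift it to a root of $\chi$ by Newton iteration, then apply Horner) can be sketched as follows. This is a naive illustration with schoolbook arithmetic in $(\mathbb{Z}/p^\nu)[x]/\chi(x)$, assuming $\deg\chi \ge 2$ and that $\chi$ is monic and irreducible modulo $p$; the function names are illustrative and not from the paper or its references.

```python
def polmulmod(f, g, chi, mod):
    """Product f*g in (Z/mod)[x]/(chi); chi monic, coefficient lists low degree first."""
    d = len(chi) - 1
    prod = [0] * (2 * d - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] = (prod[i + j] + fi * gj) % mod
    for i in range(2 * d - 2, d - 1, -1):       # eliminate x^i for i >= d
        c = prod[i]
        for j in range(d + 1):
            prod[i - d + j] = (prod[i - d + j] - c * chi[j]) % mod
    return prod[:d]


def polpow(f, e, chi, mod):
    """f**e in the quotient ring, by square and multiply."""
    d = len(chi) - 1
    result = [1] + [0] * (d - 1)
    while e:
        if e & 1:
            result = polmulmod(result, f, chi, mod)
        f = polmulmod(f, f, chi, mod)
        e >>= 1
    return result


def evaluate(coeffs, z, chi, mod):
    """Horner evaluation of an integer polynomial at the ring element z."""
    d = len(chi) - 1
    acc = [0] * d
    for c in reversed(coeffs):
        acc = polmulmod(acc, z, chi, mod)
        acc[0] = (acc[0] + c) % mod
    return acc


def frobenius_of_x(chi, p, k, nu):
    """sigma^k(x) modulo p**nu, by Newton lifting of x^(p^k) mod p to a root of chi."""
    d = len(chi) - 1
    x = [0, 1] + [0] * (d - 2)
    dchi = [i * c for i, c in enumerate(chi)][1:]               # derivative of chi
    z = polpow(x, p ** k, chi, p)                               # starting value mod p
    w = polpow(evaluate(dchi, z, chi, p), p ** d - 2, chi, p)   # 1/chi'(z) in F_q
    prec = 1
    while prec < nu:
        prec = min(2 * prec, nu)
        mod = p ** prec
        step = polmulmod(evaluate(chi, z, chi, mod), w, chi, mod)
        z = [(a - b) % mod for a, b in zip(z, step)]            # z <- z - chi(z)/chi'(z)
        t = [(-c) % mod for c in polmulmod(evaluate(dchi, z, chi, mod), w, chi, mod)]
        t[0] = (t[0] + 2) % mod
        w = polmulmod(w, t, chi, mod)                           # refine the inverse as well
    return z
```

With $\chi = x^2 + 1$ and $p = 3$ the Frobenius must send $x$ to the other root $-x$, and applying it to $1 + 2x$ is then a single Horner evaluation: `evaluate([1, 2], frobenius_of_x([1, 0, 1], 3, 1, 5), [1, 0, 1], 3**5)`.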
In step 7 we need a representation of $\mathbb{Q}_{q^n}$ as $\mathbb{Q}_q[y]/\varphi(y)$ such that $y = \gamma$,
a Teichmüller lift of $\bar\gamma$. As described by Vercauteren [5, Section 12.1.2] we can
compute in time $\tilde{O}(an\nu)$ a polynomial $f(z)$, called a Teichmüller modulus,
over $\mathbb{Q}_p$ such that $\mathbb{Q}_{q^n} = \mathbb{Q}_p[z]/f(z)$ and $z = \gamma$. Here $\bar f$, the reduction of
$f$ modulo $p$, is the minimal polynomial of $\bar\gamma$ over $\mathbb{F}_p$ (to be computed in time
$O(a^2n^2)$ as in [28]). Now we know that as $\varphi(z) = 0$, we can split $f$ over $\mathbb{Q}_q$ as
$\varphi\cdot\varphi'$ for some polynomial $\varphi'$ such that $\bar\varphi$ and $\bar\varphi'$ are relatively prime. As it is
easy to compute $\bar\varphi$ and $\bar\varphi'$, again using [28], we can use classical Hensel lifting
[34, Section 15.4] for computing $\varphi$ in time $\tilde{O}(an\nu)$.

Finally we have to consider the action of Frobenius $\sigma^k$ on $\mathbb{Q}_{q^n}$. The
dependency on $n$ of this step will dominate by far the overall complexity, hence any
improvement here will immediately give better algorithms. Assuming $k = O(an)$,
we start again with computing $\sigma^k(y)$. A naive approach would be to reduce
$y^{p^k}$ modulo $\varphi(y)$, but we can do this in a better way. Namely, compute in time
$O(a^2n^2)$ the projection $y^{p^k}$ in $\mathbb{F}_{q^n} = \mathbb{F}_q[y]/\bar\varphi(y)$. Now we know that $\sigma^k(y)$ is
also a Teichmüller lift, and Section 12.8.1 of [5] shows how to compute $\sigma^k(y)$ in
time $\tilde{O}(an\nu)$. For an element $\alpha(y) \in \mathbb{Q}_{q^n}$, we can compute $\sigma^k(\alpha)$ in two
different ways, resulting in the two complexity estimates of Theorem 1. We may
assume that $\alpha(y) \in \mathbb{Z}_{q^n}$. Compute $\alpha^{\sigma^k}(y)$ in time $\tilde{O}(na^2\nu)$, which is the degree
$n-1$ polynomial over $\mathbb{Z}_q$ obtained by applying $\sigma^{k \bmod a}$ to the coefficients of $\alpha$.
Using Horner's method we obtain $\alpha^{\sigma^k}(\sigma^k(y))$ modulo $\varphi(y)$ in time $\tilde{O}(n(an\nu))$
with memory requirements $O(an\nu)$. All together we can thus compute $\sigma^k$ on
an element of $\mathbb{Q}_{q^n}$ in time $\tilde{O}(n^2a^2\nu)$ and space $O(an\nu)$. A second method is
more complicated but gives the faster algorithm of Theorem 1. We will treat
it in the following section, where (9) gives the fastest result, and (11) gives for
each $b \in [0, 0.5]$ a trade-off between required memory and time.

6.2 Fast modular composition of polynomials 

The following exposition is based on Section 12.2 of [33], but we have added a
certain trade-off in time versus memory.

Write $g(y) := \alpha^{\sigma^k}(y)$ and $\eta(y) := \sigma^k(y)$, then these are polynomials over $\mathbb{Z}_q$
of degree at most $n-1$, whereas $\varphi(y)$ has degree $n$. Our goal is to compute
$g(\eta(y)) \bmod \varphi(y)$.

Let $b \in (0, 0.5]$ and define $m := \lceil n^b \rceil$ and $m' := \lceil n/m \rceil \approx n^{1-b}$. We proceed
in a number of steps.

Step 1 We compute (trivially) polynomials $g_0(y), \ldots, g_{m'-1}(y)$ over $\mathbb{Z}_q$ of
degree at most $m-1$ such that

$$g(y) = g_0(y) + g_1(y)\,y^m + \cdots + g_{m'-1}(y)\,y^{(m'-1)m}$$

and compute $\eta(y)^i \bmod \varphi(y)$ for $i = 0, \ldots, m-1$. This can be achieved in time
$\tilde{O}(man\nu)$ and space $O(man\nu)$.
Step 2 For a polynomial $f(y)$ we denote by $[f(y)]$ the row matrix constructed
from its coefficients (if necessary padded with zeroes), e.g. $[y + 2y^3] = (0\ 1\ 0\ 2)$.
Consider the following product of an $m' \times m$ with an $m \times n$ matrix over $\mathbb{Z}_q$:

$$\begin{pmatrix} [r_0(y)] \\ [r_1(y)] \\ \vdots \\ [r_{m'-1}(y)] \end{pmatrix} = \begin{pmatrix} [g_0(y)] \\ [g_1(y)] \\ \vdots \\ [g_{m'-1}(y)] \end{pmatrix} \cdot \begin{pmatrix} [\eta(y)^0] \\ [\eta(y)^1] \\ \vdots \\ [\eta(y)^{m-1}] \end{pmatrix} \qquad (8)$$

Continue now with step 3 or step 4.
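
Steps 1 and 2 together with the final recombination amount to a baby-step/giant-step composition. The sketch below is a toy version under explicit assumptions: coefficients are integers modulo a fixed modulus (so $a = 1$), the matrix product (8) is done row by row with schoolbook arithmetic rather than with fast rectangular matrix multiplication, and the recombination uses Horner's rule in $\eta^m$. The function names are hypothetical.

```python
def polmulmod(f, g, phi, mod):
    """Product f*g in (Z/mod)[y]/(phi); phi monic, coefficient lists low degree first."""
    n = len(phi) - 1
    prod = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] = (prod[i + j] + fi * gj) % mod
    for i in range(len(prod) - 1, n - 1, -1):    # eliminate y^i for i >= n
        c = prod[i]
        for j in range(n + 1):
            prod[i - n + j] = (prod[i - n + j] - c * phi[j]) % mod
    prod = prod[:n]
    return prod + [0] * (n - len(prod))


def compose_naive(g, eta, phi, mod):
    """g(eta) mod phi by Horner's rule: about deg(g) products in the quotient ring."""
    n = len(phi) - 1
    acc = [0] * n
    for c in reversed(g):
        acc = polmulmod(acc, eta, phi, mod)
        acc[0] = (acc[0] + c) % mod
    return acc


def compose_bsgs(g, eta, phi, mod, m):
    """g(eta) mod phi with block size m, following steps 1 and 2."""
    n = len(phi) - 1
    # Step 1: blocks g_j of m coefficients each, and the powers eta^0, ..., eta^(m-1).
    blocks = [g[j:j + m] + [0] * (m - len(g[j:j + m])) for j in range(0, len(g), m)]
    powers = [[1] + [0] * (n - 1)]
    for _ in range(m):
        powers.append(polmulmod(powers[-1], eta, phi, mod))
    eta_m = powers.pop()                          # eta^m, the giant step
    # Step 2: rows r_j = g_j(eta), i.e. the matrix product (8).
    rows = [[sum(blk[i] * powers[i][c] for i in range(m)) % mod for c in range(n)]
            for blk in blocks]
    # Recombine g(eta) = sum_j r_j * (eta^m)^j.
    acc = [0] * n
    for r in reversed(rows):
        acc = polmulmod(acc, eta_m, phi, mod)
        acc = [(a + b) % mod for a, b in zip(acc, r)]
    return acc
```

The gain over `compose_naive` is that only about $m + m'$ ring multiplications remain; the bulk of the work moves into the matrix product, which is where fast matrix multiplication can be applied.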

Step 3 Taking $b = 0.5$, $m \approx m' \approx \sqrt{n}$ we have to compute in (8) the product
of an $m \times m$ and an $m \times m^2$ matrix, and as proven in [Section 5] this can
be done using $\tilde{O}(m^{3.334})$ operations in $\mathbb{Z}_q$. This yields a time complexity of
$\tilde{O}(n^{1.667}a\nu)$, and the memory requirements for (8) are $O(mna\nu) = O(n^{1.5}a\nu)$.
We can conclude that it is possible to compute $\sigma^k$ on an element of $\mathbb{Q}_{q^n}$ modulo
$p^\nu$ in

$\tilde{O}(n^{1.667}a^2\nu)$ time and $O(n^{1.5}a\nu)$ space. (9)



Step 4 If we do not compute the product (8) at once, we can gain some memory.
Note that representing the result of the product in (8) requires already
$O(m'na\nu)$ bits of space. We divide the matrix formed by the polynomials
$g_i(y)$ into $\lceil m'/m \rceil$ matrices of size $m \times m$: the first one, say $G_0$, is given by
$\{g_0(y), \ldots, g_{m-1}(y)\}$, and generally $G_k$ is given by $\{g_{km}(y), \ldots, g_{km+m-1}(y)\}$.
The corresponding matrix products are computed one by one.

Define $g^{(-1)} := 0$ and compute for every $k$

$$g^{(k)} := g^{(k-1)} + \sum_{i=0}^{m-1} r_{mk+i}(y)\cdot(\eta(y)^m)^{mk+i} \bmod \varphi(y). \qquad (10)$$

This has to be alternated with the computation of the relevant matrix products
above. At the end we have computed $g^{(\lceil m'/m \rceil - 1)} = g(\eta(y)) \bmod \varphi(y)$, but used
only

$O(mna\nu) = O(n^{1+b}a\nu)$

bits of memory, the requirement for each single matrix product. It is clear that
(10) requires time $\tilde{O}(mna\nu)$ and has to be done approximately $m'/m$ times,
resulting in $\tilde{O}(m'na\nu) = \tilde{O}(n^{2-b}a\nu)$. We will now investigate the time requirements
for the matrix products. It is easy to verify that the product of an $m \times m$ with
an $m \times m^c$ for some $c \ge 1$ takes no more time than $\tilde{O}(m^\omega(m^c/m)) = \tilde{O}(m^{c-1+\omega})$
operations in $\mathbb{Z}_q$. We have to do this approximately $m'/m$ times, and as $c = 1/b$
we find

$$\tilde{O}\left(m^{c-1+\omega}\,\frac{m'}{m}\,a\nu\right) = \tilde{O}(n^{2+(\omega-3)b}a\nu) = \tilde{O}(n^{2-0.624b}a\nu).$$

We note that for $b = 0.5$ this gives $\tilde{O}(n^{1.688}a\nu)$, precisely the same result as in
[34, Corollary 12.5].

Combining these results with the $O(a^2n^2)$ computation of $y^{p^k}$ in Section 6.1
we can compute $\alpha^{\sigma^k}(\sigma^k(y))$ in

time $\tilde{O}(n^{2-0.624b}a^2\nu)$ and space $O(n^{1+b}a\nu)$. (11)

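
The exponents in the trade-off are simple arithmetic in $b$ and $\omega$. A small helper like the following (names chosen here for illustration; the default $\omega = 2.376$ is the value used in Section 6.3) makes it easy to tabulate them.

```python
OMEGA = 2.376  # matrix multiplication exponent assumed in the complexity analysis


def time_exponent(b, omega=OMEGA):
    """Exponent of n in the time bound for the modular composition."""
    return 2 + (omega - 3) * b


def memory_exponent(b):
    """Exponent of n in the space bound for the modular composition."""
    return 1 + b


def tradeoff_table(steps=5):
    """List of (b, time exponent, memory exponent) for b = 0, ..., 0.5."""
    return [(0.5 * i / steps, time_exponent(0.5 * i / steps), memory_exponent(0.5 * i / steps))
            for i in range(steps + 1)]
```

At $b = 0.5$ this gives the exponent $1.688$ in time and $1.5$ in space, and at $b = 0$ it degenerates to the quadratic Horner method with linear space.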
6.3 The complexity of the algorithm 

We will use the following bounds: $N = O(nga)$, $N_8$, $N_b$ and $M$ are all
$O(nga\log g)$, $N_\Gamma = O(ng^2a\kappa\log g)$, $N_a = O(nga\kappa\log^3 g)$ and the last constants
$N_3$, $N_4$ and $N_6$ are $O(g)$. For example, representing an element of $\mathbb{Q}_q$ requires
then $O(aN_a) = O(nga^2\kappa\log^3 g)$ bits. We recall that we work modulo $\Gamma^{N_\Gamma}$, in
steps 1 to 6 modulo $p^{N_a}$ and in steps 7 and 8 modulo $p^{N_b}$.

Step 1 We can compute $r$, $\alpha$ and $\beta$ with classical Gaussian elimination, which
results in time $\tilde{O}(g^3\cdot g\kappa\cdot aN_a)$. Here we use an $O(g\kappa)$ bound for the resulting
degree in $\Gamma$. The net result is $\tilde{O}(ng^5a^2\kappa^2)$. We could gain one factor $g$ by using
more sophisticated methods for computing the resultant as e.g. in [25].

Step 2 We have to apply formula (2) $O(g)$ times, and each time this requires
$\tilde{O}(g^2\kappa aN_a)$ calculations. Indeed, we can work with polynomials of $\deg_X \le O(g)$
and $\deg_\Gamma \le O(g\kappa)$ as proven in Proposition 18. As we have $2g$ basis elements
to consider, this gives together $\tilde{O}(ng^5a^2\kappa^2)$.

Step 3 The computation of a single $C_k$ consists of at most $O(g\kappa + \rho)$ matrix
products and such a product requires $\tilde{O}(g^\omega aN_a)$ time. Here $\omega$ is chosen bigger
than the exponent for matrix multiplication as defined in [31], and we can take
$\omega = 2.376$. The number of $C_k$'s to be computed is $N_\Gamma$, so an overall time
complexity is $\tilde{O}(g^{1+\omega}a\kappa N_aN_\Gamma) = \tilde{O}(n^2g^{4+\omega}a^3\kappa^3)$.

The memory requirements for representing $C$ dominate this step, and are
$O(aN_aN_\Gamma g^2) = O(n^2g^5a^3\kappa^2\log^3 g)$. Note that for this last result we have to
use Proposition 21 to ensure that there appear no denominators that are too
big. Steps 4 and 5 require certainly less memory space than this step.

Step 4 The calculation of $C^\sigma(\Gamma^p)$ needs time $\tilde{O}(g^2aN_aN_\Gamma)$, and the inversion
costs $\tilde{O}(g^\omega aN_aN_\Gamma)$.

Step 5 If we look at the analysis of Kedlaya and take into account the higher
accuracy $N_a$, we find a time complexity of $\tilde{O}(g^2N_a^2) = \tilde{O}(g^4a^2n^2)$.

Step 6 Computing $F(\Gamma)$ as a matrix of power series needs $\tilde{O}(g^\omega aN_aN_\Gamma)$ time,
and requires memory space of size $O(g^2aN_aN_\Gamma)$. Multiplying $F(\Gamma)$ and $r^M$
requires time $\tilde{O}(g^2aN_aN_\Gamma)$.

Step 7 This step is explained in more detail in Section 6.1. To compute $\bar\varphi$ we
can use Shoup's algorithm which requires $\tilde{O}(a\sqrt{n} + n^2)$ time, and this can be
ignored in the global complexity. To determine $\varphi$ we need $\tilde{O}(anN_b)$ of time,
and to reduce $g(y)$ modulo $\varphi(y)$ takes $\tilde{O}(aN_bN_\Gamma)$ time due to [2]. Multiplying
by $r(\gamma)^{-M}$ is negligible. As we have to do all this for the entire matrix, this
step costs $\tilde{O}(n^2g^5a^3\kappa)$ and the size of the resulting matrix, with entries in
$\mathbb{Q}_{q^n}$, is $O(anN_bg^2) = O(n^2g^3a^2\log g)$.

Step 8 Computing a $O(an)$th power of Frobenius as explained in the previous
section costs time $\tilde{O}(n^{2.667}ga^3)$ respectively $\tilde{O}(n^3ga^3)$ with memory
requirements of $O(n^{2.5}ga^2\log g)$ respectively $O(n^2ga^2\log g)$. For determining
the product of the matrices we use fast matrix multiplication, resulting in
time $\tilde{O}(g^\omega anN_b) = \tilde{O}(n^2g^{1+\omega}a^2)$. Finally we compute the determinant using
Newton's formula, for which we have to compute $\mathcal{F}, \mathcal{F}^2, \ldots, \mathcal{F}^{2g}$ in time
$\tilde{O}(g^{1+\omega}anN_b) = \tilde{O}(n^2g^{2+\omega}a^2)$.

Taking the maximum of all the above requirements we find immediately
Theorem 1. For the second theorem we only need to keep $F(\Gamma)$ modulo $p^{N_b}$ and
to use steps 7 and 8, so this theorem follows as well. The following picture gives
a graphical overview of the time/memory requirements for our and Kedlaya's
algorithm, where the full line indicates the trade-off explained in Section 6.2.
The values on the axes are to be seen as exponents for $n$ in the complexity.

[Figure: memory exponent (vertical axis, MEMORY) against time exponent (horizontal axis, TIME, with marks at 2, 2.667, 2.688 and 3) for Kedlaya's algorithm and for the two variants of the deformation algorithm.]




7 Remarks 

7.1 The elliptic curve case 

For elliptic curves the above results are particularly interesting. It is well known
that any elliptic curve in characteristic unequal to 2 can be described up to
isomorphism by an equation

$$Y^2 = X\cdot(X-1)\cdot(X-\lambda)$$

for some parameter $\lambda$; this is called the Legendre normal form. Moreover, if the
original curve is defined over $\mathbb{F}_q$ and has $\mathbb{F}_q$-rational 2-torsion, then $\lambda$ will be in
$\mathbb{F}_q$ as well. If we define $\bar Q(X,\Gamma) := X\cdot(X-1)\cdot(X-\Gamma+1)$ over $\mathbb{F}_p$, we can hence
reach every elliptic curve with rational 2-torsion with our deformation theory,
and this over the prime field ($a = 1$) and with a linear deformation ($\kappa = 1$).
Theorem 1 gives then that we can compute the zeta function of an elliptic curve
over $\mathbb{F}_{p^n}$ in time $\tilde{O}(n^3)$ and space $O(n^2)$ or time $\tilde{O}(n^{2.667})$ and space $O(n^{2.5})$.
More details can be found in Section [870

If we compare this with Harley's result [10], we see that his algorithm has
the same space complexity, but a better time complexity, $O(n^2)$. The main
reason behind this difference is the following. We have to compute the 'norm'
of a $2 \times 2$ matrix over $\mathbb{Q}_{p^n}$, which is the only step that requires time $O(n^3)$.
Harley however only has to compute the norm of an element of $\mathbb{Q}_{p^n}$, and for
this he uses a classical formula that expresses this norm as a resultant. More
precisely, let $\mathbb{Z}_{p^n} = \mathbb{Z}_p[y]/f(y)$ with $f(y)$ a Teichmüller modulus (see Section
6.1); then for $\alpha(y) \in \mathbb{Z}_{p^n}$ we have

$$N_{\mathbb{Z}_{p^n}/\mathbb{Z}_p}(\alpha(y)) = \mathrm{Res}_y(\alpha(y), f(y)),$$

and this resultant can be evaluated very fast using an adaptation of Moenck's
XGCD algorithm. All details of Harley's result can be found in Section 3.10 of
[33]. In [14] we present a variant of our algorithm for genus 1 which has the
same complexity as Harley's method. This works by 'semi-diagonalising' the
matrix $F(\gamma)$ and using the above fast norm computation. An implementation
of this algorithm allows us to compute the zeta function of a random elliptic
curve over $\mathbb{F}_{3^{100}}$ in less than one second.
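
The norm-as-resultant identity can be checked on a toy example. The following is a minimal sketch, assuming exact integer coefficients instead of $p$-adic ones and a plain Gaussian-elimination determinant instead of the fast XGCD-based resultant mentioned above; the function names are hypothetical and not taken from the paper. For monic $f$, the determinant of multiplication by $\alpha$ on $\mathbb{Z}[y]/f(y)$ equals the Sylvester determinant with the rows of $f$ placed first; swapping the two arguments changes the result by at most the sign $(-1)^{\deg \alpha \cdot \deg f}$.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals (exact)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def sylvester(a, b):
    """Sylvester matrix of a and b; coefficient lists, lowest degree first."""
    a, b = a[::-1], b[::-1]                      # highest degree first
    da, db = len(a) - 1, len(b) - 1
    rows = [[0] * i + a + [0] * (db - 1 - i) for i in range(db)]
    rows += [[0] * i + b + [0] * (da - 1 - i) for i in range(da)]
    return rows

def mul_matrix(alpha, f):
    """Matrix of multiplication by alpha on Z[y]/(f), with f monic.

    Coefficient lists are given lowest degree first; column j holds
    alpha * y^j reduced modulo f.
    """
    d = len(f) - 1
    cols = []
    for j in range(d):
        prod = [0] * j + list(alpha)
        prod += [0] * max(0, d - len(prod))
        for k in range(len(prod) - 1, d - 1, -1):  # kill y^k for k >= d
            c = prod[k]
            if c:
                for i in range(d + 1):
                    prod[k - d + i] -= c * f[i]
        cols.append(prod[:d])
    return [[cols[j][i] for j in range(d)] for i in range(d)]

f = [-1, -1, 0, 1]        # f(y) = y^3 - y - 1 (monic)
alpha = [2, 3, 1]         # alpha(y) = y^2 + 3y + 2
norm = det(mul_matrix(alpha, f))
resultant = det(sylvester(f, alpha))
print(norm, resultant)
```

Both determinants agree here; the sketch only illustrates the identity and says nothing about the quasi-quadratic running time, which depends on the fast resultant algorithm.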

If the curve has no $\mathbb{F}_q$-rational 2-torsion, the parameter $\lambda$ will be in an
extension field of $\mathbb{F}_q$ of degree at most 3, and the curves are isomorphic over
an extension field of degree at most 6 (see Proposition III.1.7 in [?]). If we
compute the zeta function over this extension field, then it is possible to recover
the original zeta function in an efficient but nondeterministic way; more details
can be found in our forthcoming Ph.D. thesis [12]. However, in our paper [14] we
were able to avoid this problem by choosing better families than the Legendre
family.

7.2 More than one parameter families? 

An interesting question would be to see how many hyperelliptic curves we can
reach with a deformation as described in this paper. More precisely, to get
more curves over $\mathbb{F}_{q^n}$ we could enlarge the base field $\mathbb{F}_q$, increase the $\Gamma$-degree $\kappa$ of
$Q(X, \Gamma)$, and even take the substitution $\Gamma \leftarrow \gamma$ in a bigger field than $\mathbb{F}_{q^n}$.
Although we have no decisive answer on this matter, the fact that the moduli
space of hyperelliptic curves of genus $g$ in odd characteristic has dimension $2g - 1$
suggests that this is not possible in a useful way for $g \geq 2$. Indeed, the number of
'families of degree $\leq \kappa$' of the form $Y^2 = Q(X, \Gamma)$ is clearly at most $q^{(\kappa+1)(2g+1)}$,
as there are $2g + 1$ coefficients, all of the form $\alpha_\kappa \Gamma^\kappa + \alpha_{\kappa-1} \Gamma^{\kappa-1} + \cdots$ with all
$\alpha_i \in \mathbb{F}_q$. Every family gives no more than $q^n$ curves, so an upper limit for the
number of curves reached with a degree $\kappa$ deformation is $q^{(\kappa+1)(2g+1)} q^n$. The
dimension of the moduli space gives that there exist about $q^{(2g-1)n}$ hyperelliptic
curves of genus $g$ over $\mathbb{F}_{q^n}$, hence for $\kappa$ and $g \geq 2$ fixed we really can have only a
very small part of the possible curves. Of course it would be possible to increase
$a$ and/or $\kappa$ to $O(n)$, but this would not give a good algorithm (at all).
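
To make the counting argument concrete, the two exponents of $q$ can simply be compared. The sketch below is only an illustration of the bound stated above; the function name is hypothetical.

```python
def log_q_fraction_reached(g, kappa, n):
    """Exponent e such that (curves reached) / (all curves) is at most about q^e.

    Upper bound on curves reached: q^((kappa+1)(2g+1)) * q^n.
    Approximate number of genus g hyperelliptic curves over F_{q^n}: q^((2g-1)n).
    """
    reached = (kappa + 1) * (2 * g + 1) + n
    total = (2 * g - 1) * n
    return reached - total

# Genus 2 with a linear deformation: the reachable fraction shrinks quickly with n.
for n in (50, 100, 200):
    print(n, log_q_fraction_reached(g=2, kappa=1, n=n))

# Genus 1: the exponent is non-negative, so the count gives no obstruction.
print(log_q_fraction_reached(g=1, kappa=1, n=100))
```

For $g = 2$ and $\kappa = 1$ the exponent is $10 - 2n$, which is why a fixed family covers only a vanishing fraction of the curves, whereas for $g = 1$ the bound is vacuous, in line with the Legendre family reaching every elliptic curve with rational 2-torsion.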

8 Implementation results 

Until now we have considered only theoretical complexities, but in order to be
of any practical use, the algorithm has to work at a decent speed in a concrete
implementation. We have implemented a version of the algorithm with theoretical
time complexity $O(n^3)$ and space $O(n^2)$, using the computational algebra
system Magma⁷ V2.12-14.

Our implementation⁸ does not use the basis $\{X^i\,dX/\sqrt{Q}\}_i$ for $H^-_{MW}$, but
instead we use $\{X^i\,dX/\sqrt{Q}^3\}_i$, which gives (as pointed out by Kedlaya [?,
Section 3.5]) an integral matrix of Frobenius. This gives a program that is easier
to implement and faster (although not asymptotically). We require the base
polynomial $Q(X, \Gamma)$ to be defined over a prime field (of odd characteristic), and
use a naive lift for $\mathbb{Q}_q$ and $\mathbb{Q}_{q^n}$ and a Teichmüller lift for $\gamma$. The situation $\Gamma = 0$
is handled by Harrison's implementation of Kedlaya's algorithm, as it is built
into Magma. We will present a few concrete results of this algorithm, all achieved
on a Pentium IV running at 2.4 GHz, using 1.5 GB of memory, with SuSE Linux
9.0 as operating system.

In trying to speed up the algorithm, we found that the crucial parameter is
$M$, which depends on the convergence behavior of the entries of $F(\Gamma)$. Experimentally
a good value turned out to be $M = (3n/2 + 10)pg/3$, but some more
tuning of this parameter may make the algorithm faster.
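
As a quick sanity check of this experimental formula, it can be evaluated for the elliptic curve example over $\mathbb{F}_{3^{500}}$ reported in Section 8.1, where $M = 760$ is quoted. The helper below is a hypothetical illustration; it returns the exact rational value and leaves any rounding convention open, since the text does not specify one.

```python
from fractions import Fraction

def heuristic_M(p, n, g):
    """Experimental choice M = (3n/2 + 10) * p * g / 3, as an exact rational."""
    return (Fraction(3 * n, 2) + 10) * p * g / 3

# Elliptic curve (g = 1) over F_{3^500}
print(heuristic_M(p=3, n=500, g=1))
```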

8.1 Elliptic curves 

As mentioned before, Harley's algorithm has complexity $O(n^2)$, and will hence
be faster than ours, but to our knowledge it has not yet been implemented
in Magma for odd characteristic; Magma uses the SEA algorithm for such
curves⁹. We always work with the Legendre family and a random parameter
in the finite field. For small fields SEA is faster than the deformation, but for
field sizes starting from about $3^{100}$, $5^{60}$ or $7^{40}$ the deformation algorithm is
substantially faster. A few timing results are gathered in the following table.
Note that all times are in seconds.

⁷ See http://magma.maths.usyd.edu.au/.

⁸ Available on http://wis.kuleuven.be/algebra/hubrechts/.

⁹ An implementation in C++ of a faster algorithm is described in [23].

| p \ n | 20 | 40 | 60 | 80 | 100 | 150 |
|---|---|---|---|---|---|---|
| 3 | 0.95 | 5.29 | 10.74 | 27.75 | 40.28 | 178 |
| 5 | 2.40 | 8.01 | 16.37 | 39.18 | 62.79 | 245 |
| 29 | 22.59 | 69.98 | 145.35 | 279.90 | 419 | 1300 |
| 107 | 158.67 | 472.52 | 1025.39 | 1742.27 | 2676 | 7669 |
| 233 | 520.64 | 1619.10 | 3379.69 | | | |

A parameter $\gamma \in \mathbb{F}_{3^{500}}$ used 4605 seconds, and to get an idea of the object sizes
involved: the $p$-adic accuracy in the beginning of the algorithm was $N_a = 298$,
the power series in $\Gamma$ were computed modulo $\Gamma^{N_\Gamma}$ with $N_\Gamma = 3048$, and $M$ was
equal to 760. Most of the time was consumed by computing $\gamma$, $F(\gamma)$ and $\mathcal{F}$.
It is indeed quite interesting to study which steps in the algorithm take most
of the time. In the following table the second $F(\Gamma)$ column stands for the analytic continuation
of $F(\Gamma)$ (which comes down to the product $r^M \cdot F(\Gamma)$), and $C^{-1}$ is short for …





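| Field | $r(\Gamma)$ | $H$ | $C$ | $C^{-1}$ | $F(0)$ | $F(\Gamma)$ | $F(\Gamma)$ (analytic continuation) | $\gamma$ | $F(\gamma)$ | $\mathcal{F}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| $\mathbb{F}_{3^{50}}$ | 0.26 | | 0.18 | 0.06 | 0.08 | 1.19 | 0.76 | 0.52 | 1.38 | 3.11 |
| $\mathbb{F}_{3^{100}}$ | 0.26 | | 0.37 | 0.13 | 0.14 | 3.87 | 2.26 | 2.65 | 6.71 | 23.76 |
| $\mathbb{F}_{3^{150}}$ | 0.25 | | 0.55 | 0.19 | 0.20 | 8.39 | 5.37 | 12.23 | 25.09 | 125.5 |
| $\mathbb{F}_{3^{500}}$ | 0.26 | | 2.36 | 0.83 | 0.93 | 93.03 | 55.46 | 280.9 | 424.9 | 3710 |
| $\mathbb{F}_{7^{100}}$ | 0.30 | | 0.90 | 0.15 | 0.34 | 15.66 | 10.67 | 6.22 | 19.76 | 31.61 |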

This table shows that as the extension degree increases, the last steps
in the algorithm (precisely those which depend on $\gamma$) become more time
consuming. This is to be expected, as they are precisely the steps cubic in $n$.
Checking the correctness of the result took at most a few seconds.
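
The cubic behaviour can be read off roughly from the timings themselves. The sketch below, whose helper name is hypothetical, estimates an empirical growth exponent from two measurements, here the last column of the table above for $\mathbb{F}_{3^{100}}$ (23.76 s) and $\mathbb{F}_{3^{500}}$ (3710 s). Two data points give only a crude estimate, and constant factors, cache effects and the growing $p$-adic precision are all ignored.

```python
import math

def growth_exponent(n1, t1, n2, t2):
    """Exponent e in t ~ n^e fitted through two (n, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)

# Last column of the step timing table: n = 100 and n = 500 over F_3
exponent = growth_exponent(100, 23.76, 500, 3710.0)
print(round(exponent, 2))
```

The estimate comes out slightly above 3, which is consistent with the final step being the cubic one.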

To get an idea of the memory requirements, Magma reported 11.87 MB for
$\gamma \in \mathbb{F}_{3^{100}}$, and for $\gamma \in \mathbb{F}_{3^{500}}$ the amount was 48 MB. This includes the kernel
memory, approximately 3 MB.

8.2 Higher genus 

The dependency on the genus is not as good as in Kedlaya's algorithm, and
for $g > 5$ the algorithm is not very useful anymore. In particular the memory
use, although $O(n^2)$, then grows too big even for very small
$n$. We tested a few higher genus curves as follows: $Q$ is defined to be equal to
$X^{2g+1}$ plus (a random polynomial over $\mathbb{F}_p$ of degree at most $2g$) plus $\Gamma \times$ (such
a polynomial). For the parameter $\gamma$ again a random element is chosen. For these
higher genus situations it is interesting to take advantage of the deformation in
order to compute more zeta functions within a family. Therefore we present all
timing results as $x/y$, where $x$ is the time needed for the precomputation and
$y$ the time for one parameter. For each genus, the second column below gives the memory use
in MB.

| $p^n$ \ $g$ | 2: time | 2: MB | 3: time | 3: MB | 4: time | 4: MB |
|---|---|---|---|---|---|---|
| $3^{50}$ | 165/45 | 18.88 | 1840/219 | 61 | 13899/745 | 175 |
| (K) $3^{50}$ | 166 | 24.28 | 772 | 84 | 2070 | 142 |
| $3^{100}$ | 520/281 | 28.10 | 7062/1402 | 109 | 46672/4898 | 356 |
| (K) $3^{100}$ | 1216 | 110.10 | 6104 | 399 | 17483 | 723 |
| $3^{200}$ | 2050/2120 | 54.29 | 28144/12026 | 238 | | |
| (K) $3^{200}$ | 11785 | 635.34 | | | | |
| $3^{400}$ | 8635/20586 | 122.54 | | | | |
| $5^{100}$ | 1177/490 | 39.73 | 17356/2529 | 188 | 99886/8826 | 578 |
| (K) $5^{100}$ | 2796 | 221.30 | 16082 | 606 | 46842 | 1280 |
| $31^{100}$ | 23989/3817 | 273.30 | | | | |

The (K) in front of a row means that Kedlaya's algorithm was used for precisely
the same curve. The most striking difference is the case with genus 2
over $\mathbb{F}_{3^{200}}$, where we tried a couple of different random equations, all of which
gave similar results. We note also that it was not even possible to try a genus
2 curve over $\mathbb{F}_{3^{400}}$ with Kedlaya's method, as it would require approximately
5 GB of memory, eight times the amount used for $3^{200}$. For higher genus the
advantage of deformation lies more in the memory requirements and in
computing within families.
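
The trade-off between precomputation and per-curve time can be quantified from the table entries. The following sketch, with a hypothetical helper name, computes the smallest number $m$ of curves in one family for which the deformation total $x + m \cdot y$ drops below $m$ runs of Kedlaya's algorithm; it uses only the timings reported above and assumes that every curve in the family costs the same.

```python
import math

def break_even(precomputation, per_curve, kedlaya):
    """Smallest m with precomputation + m * per_curve < m * kedlaya.

    Returns None when Kedlaya's time per curve is not larger than the
    per-curve deformation time, so that no break-even point exists.
    """
    if kedlaya <= per_curve:
        return None
    return math.floor(precomputation / (kedlaya - per_curve)) + 1

# Timings in seconds over F_{3^50}: (x, y, Kedlaya) for genus 2, 3 and 4
for g, (x, y, k) in {2: (165, 45, 166), 3: (1840, 219, 772), 4: (13899, 745, 2070)}.items():
    print(g, break_even(x, y, k))

# Genus 2 over F_{3^100}: deformation already wins for a single curve
print(break_even(520, 281, 1216))
```

In this reading the deformation pays off after a handful of curves for small fields, and immediately for genus 2 over the larger fields.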

The conclusion is that for low genus and a sufficiently large field, a deformation
algorithm can give a substantial advantage over the 'classical approach', at
least for curves in certain one-parameter families. In particular the memory
requirements drop dramatically, and after some precomputations it is rather
efficient to compute concrete zeta functions.